Novell Network Management: NetWare 6

Chapter 7: Optimize eDirectory Performance

 

Objectives:

This chapter concerns replication and synchronizing replicas, eDirectory administration, and network time synchronization. The objectives important to this chapter are found on page 7-1:

  1. Define eDirectory Replication and Synchronization
  2. Identify Basic eDirectory Administrative Procedures
  3. Design and Implement a Time Synchronization Strategy

Concepts:

Define eDirectory Replication and Synchronization

Partitioning is defined as simply dividing eDirectory into sections. Partitions are logical divisions of eDirectory that make it possible to divide the database among servers. Each copy of one of these partitions is called a replica. Note: eDirectory can be stored on NetWare, Windows, or Linux servers. It is important to partition, replicate, and synchronize eDirectory. One reason this matters more in NetWare 6 is scale: older versions of NDS were limited to about 3500 objects in a tree, while eDirectory can hold millions of objects in one tree.

The diagram on page 7-3 illustrates several basic facts about partitions:

  • each eDirectory object must exist in one and only one partition
  • partitions cannot overlap
  • partitions are named for the topmost container in them, which is called the partition root
  • the boundaries of partitions are defined by containers
  • partitions have parent-child relationships, just like directories
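
The "one and only one partition" rule can be sketched in a few lines of Python. This is only an illustration, not anything eDirectory exposes: the function name, the dotted object names, and the ACME tree are all made up. The idea is that an object belongs to the partition whose root is its nearest ancestor container.

```python
def partition_for(object_name, partition_roots):
    """Return the partition root that holds object_name.

    Names are dot-separated with the leaf first, for example
    'CN=JSmith.OU=Sales.O=ACME'. All names here are hypothetical.
    """
    parts = object_name.split(".")
    # Walk upward from the object toward the top of the tree; the first
    # container that is a partition root owns the object.
    for i in range(len(parts)):
        candidate = ".".join(parts[i:])
        if candidate in partition_roots:
            return candidate
    # Nothing lower claimed it, so it belongs to the [Root] partition.
    return "[Root]"


roots = {"[Root]", "O=ACME", "OU=Sales.O=ACME"}
print(partition_for("CN=JSmith.OU=Sales.O=ACME", roots))  # OU=Sales.O=ACME
print(partition_for("CN=Admin.O=ACME", roots))            # O=ACME
```

Because the walk stops at the first partition root it meets, an object can never land in two partitions, which is the "partitions cannot overlap" rule in another form.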

The data within a partition is stored in a replica. Replicas also contain information about the servers they are stored on: the server name, the server type, the server's address, and a set of pointers to servers that hold other replicas of this partition. The set of servers that hold replicas of a partition is called that partition's replica ring. Replicas come in six types (sort of):

  • Master - you can make object changes here, and this is the only copy in which you can make partition changes; there can be only one master copy of each partition at one time. In case of damage to the master replica, another replica can become the new master.
  • Read/Write - you can make object changes here, but not partition changes; you can have as many of these as you want
  • Read-only - you cannot make any changes here; change requests are forwarded to the nearest replica that can make them. Novell does not recommend that you make this type, since it causes more network traffic.
  • Subordinate Reference - this is not really a replica, since it contains no eDirectory data. It is really a simple concept: if a server holds a replica of a parent partition, but holds no replica of that parent's child partition, the server is given a subordinate reference to the child partition. In this way, information can be found more easily. The drawback is that it causes more traffic and overhead for the servers.
  • Filtered Replicas - Filtered replicas contain only part of the information stored in a partition. They can be used to provide eDirectory information when only some information is needed. You can use ConsoleOne to create a Sparse Replica or a Fractional Replica. Sparse replicas provide access only to the objects you choose; fractional replicas provide access only to the properties you choose of those objects.

Novell recommends that you design your strategy so that there are few subordinate references. It also recommends against creating Read-only replicas, since they also increase traffic.
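
The subordinate reference rule is mechanical enough to sketch in Python. Everything named here is hypothetical (the Corp/East/West tree, the FS1 to FS3 servers, the function itself); the only thing taken from the chapter is the rule: a server that holds a replica of a parent partition, but no replica of a child of that parent, gets a subordinate reference to the child.

```python
def subordinate_references(parent_of, replicas):
    """Work out which subordinate references would be created.

    parent_of: child partition name -> parent partition name
    replicas:  server name -> set of partitions it holds a real replica of
    Returns:   server name -> sorted list of partitions needing an S/R
    """
    result = {}
    for server, held in replicas.items():
        result[server] = sorted(
            child
            for child, parent in parent_of.items()
            if parent in held and child not in held
        )
    return result


# Hypothetical tree: [Root] -> Corp -> East and West
parent_of = {"Corp": "[Root]", "East": "Corp", "West": "Corp"}
replicas = {
    "FS1": {"[Root]", "East"},
    "FS2": {"Corp", "West"},
    "FS3": {"[Root]", "Corp", "East", "West"},
}
print(subordinate_references(parent_of, replicas))
# {'FS1': ['Corp'], 'FS2': ['East'], 'FS3': []}
```

Note that FS3, which holds everything, needs no subordinate references at all. Placing child replicas alongside parent replicas is one way to keep the count down.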

Several reasons are given for creating eDirectory partitions:

  • eDirectory fault tolerance. Note that this does not add fault tolerance to the File System. Only eDirectory information is stored in partition replicas.
  • Efficient access to eDirectory information. Partitions are efficient if they provide smaller sections of the database, close to users who need the information in those sections.
  • Scalability. Partitioning allows the network to grow without overtaxing the capacity of one database server.
  • Name resolution. eDirectory uses server names to find data that it needs.
  • Synchronization performance. A design consideration is to set up partitions so that it takes little time to copy all changes from one replica to another. Novell recommends that your design should synchronize in 30 to 60 minutes.
  • Manageability. Like other features of NetWare, partitions can be centrally administered.

For a replica to be useful, it must match the other replicas of its partition. (Remember, a replica only holds data about one partition.) eDirectory is a loosely consistent database. This means that there can be changes in one replica that other replicas do not know about. This condition is undesirable and is not meant to last for long, so changes in replicas are propagated, sent from the changed replica to all other replicas in a replica list. Since no changes are ever made in a Read-only replica, it can only receive changes made elsewhere. The process of converging the replicas, bringing them into a matching state, is called synchronization. Only changes made to eDirectory objects are sent in synchronization, not the entire objects.

Changes can be simple or complex. When a user logs in, the user's object changes to reflect the last login time and date. This is a simple change, and Novell considers it important to security to propagate this change to all replicas of the user's partition immediately. Complex changes are those that affect multiple objects, such as changes in partition boundaries. Complex changes take longer to propagate and synchronize: the more objects and properties that are affected, the longer synchronization will take.

Page 7-8 shows a graphic representation of a replica list, also called a replica ring. Note that the graphic shows a default strategy: one partition, one Master replica, two Read/Write replicas, and three servers, each holding one of the replicas. You can access a graphic view of replica rings from ConsoleOne or iMonitor. Note that the icons for each replica type are different:

  • Master replica
  • Read/Write replica

Partition and Replication rules are discussed on page 7-12. Much detail is given to each point:

  • Installing Servers
    • The first partition is created when the first server is installed in a new tree. That server gets the Master replica of that partition by default. No other server installation creates a partition.
    • The second and third servers installed in a partition receive Read/Write replicas of that partition by default. Typically, other servers installed in a partition do not automatically receive replicas.
      1. Variation: whenever a server is installed in a partition, if three replicas of that partition do not exist, the new server will receive a replica. This is the same rule, just a special case of it.
      2. Variation: whenever you upgrade a NetWare 3 server to NetWare 4, 5, or 6, it will receive a Read/Write replica of the partition containing that server's bindery context. (A bindery is a flat-file database. The NetWare 3 server that is upgraded is assigned a "bindery context", a particular container that will act like a bindery for applications that require a bindery.)
    • Other than these automatic replicas, you must create any other desired replicas manually. Subordinate references are created automatically, but they are not desired.
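
As a memory aid, the default rules above can be written as a small Python function. This is a sketch of the rules as stated in these notes, not of the actual installer logic; the function name and its parameters are invented for the example.

```python
def default_replica(existing_count, upgraded_from_netware3=False):
    """Replica type a newly installed server receives by default.

    existing_count: replicas of the partition that already exist.
    Returns 'Master', 'Read/Write', or None (no automatic replica).
    """
    if existing_count == 0:
        # First server in a new tree: it creates the partition.
        return "Master"
    if existing_count < 3 or upgraded_from_netware3:
        # Second and third servers, or an upgraded NetWare 3 server
        # (which needs the replica for its bindery context).
        return "Read/Write"
    return None


placed = []
for _ in range(5):
    count = sum(r is not None for r in placed)
    placed.append(default_replica(count))
print(placed)  # ['Master', 'Read/Write', 'Read/Write', None, None]
```

The fourth and fifth servers get nothing automatically, so any replicas they hold must be created by hand.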

Identify Basic eDirectory Administrative Procedures

The chapter continues with a list of factors that will affect synchronization of replicas.

  • Number of objects - The more objects in a partition, the longer replicas will take to synchronize.
  • Number of replicas on a server - A server can hold only one replica of any partition, but it can hold replicas of as many different partitions as you have space to hold. Each replica a server holds will tax its resources a bit more.
  • Number of replicas in a ring - Novell recommends at least three replicas of each partition, but you can have more if the network is large, or has WAN links in it. The more replicas of a partition that exist, the more traffic and time it will take to synchronize.
  • Location of replicas - You may need to place replicas on both sides of WAN links, for the convenience of users. Traffic across WAN links is typically slower than inside LANs or MANs.
  • WAN links - Now that you are afraid of WAN links, the text mentions that speed across them varies.
  • Server speed - The processor speed of the server holding a replica will affect synchronization as well.
  • Cache size - A small cache means more small updates. A large cache means fewer updates.
  • Rate of change - Changes are not all synchronized at once. Some can wait. If many changes take place, of either type, they are placed in a queue, and synchronization is not achieved until they are all synchronized.
  • Relationships between objects - Related objects require updates when changes are made. For example, if new container objects are created, updates are required for users who have rights to the parent object of those containers.

A partition structure should resemble a pyramid for the same reasons that the eDirectory structure should. Consider WAN links and geography as important, since you do not want synchronization traffic over slow links.

Lower layer partitions should consider workgroups and resources. Put users in the same partition who need the same resources.

The next section of the chapter offers an update about the capacities of eDirectory:

  • a tree may have an unlimited number of objects
  • a partition should have fewer than 500,000 objects
  • parent partitions may have up to 150 child partitions

The text discusses seven goals of good replica placement:

  • Meet workgroup needs - if workgroups are mobile, this is more complex. In an example, users need access to resources in replicas that are across a WAN link. Having a replica stored on each side of the WAN link will actually decrease traffic, in this case.
  • Create fault tolerance - the minimum for fault tolerance is three replicas of each partition, and as many as ten depending on complexity of the network.
  • Regulate the number of replicas - no more than 50 replicas of each partition, no more than 250 replicas per server. In most networks, you will not have a need for nearly that many replicas of a partition, and most servers will not have the capacity to handle anything like 250 replicas. Six replicas of a partition is a more reasonable number.
    As a rule of thumb, assume that each object in a replica takes up 5 KB of hard drive space. Replicas are saved in hidden directories, so don't assume you will see them in the file system.
  • Replicate the [Root] partition - it is noted as the most important partition in the tree
  • Set Up Name Resolution - place replicas of all partitions users need on a single server close to those users.
  • Replicate for administration - place master replicas on servers near the network administrator who is in charge of partition changes. Replicas can be large, so don't copy them when the network is busy.
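
The numbers in this list lend themselves to a quick sanity check. The Python below is a rough sketch using only the figures quoted above (5 KB per object, at least 3 and at most 50 replicas per partition, at most 250 replicas per server). The function names and the sample placement are made up.

```python
KB_PER_OBJECT = 5            # rule of thumb from the text
MIN_REPLICAS = 3             # minimum for fault tolerance
MAX_REPLICAS = 50            # per partition
MAX_PER_SERVER = 250         # replicas on one server


def replica_size_mb(object_count):
    """Estimated disk space for one replica, in megabytes."""
    return object_count * KB_PER_OBJECT / 1024


def placement_warnings(placement):
    """placement: server name -> set of partitions it holds replicas of."""
    warnings = []
    counts = {}
    for server, partitions in sorted(placement.items()):
        if len(partitions) > MAX_PER_SERVER:
            warnings.append(f"{server}: more than {MAX_PER_SERVER} replicas")
        for name in partitions:
            counts[name] = counts.get(name, 0) + 1
    for name, n in sorted(counts.items()):
        if n < MIN_REPLICAS:
            warnings.append(f"{name}: only {n} replica(s), want {MIN_REPLICAS}")
        elif n > MAX_REPLICAS:
            warnings.append(f"{name}: {n} replicas, more than {MAX_REPLICAS}")
    return warnings


print(replica_size_mb(10000))  # 48.828125
print(placement_warnings({"FS1": {"A", "B"}, "FS2": {"A"}, "FS3": {"A"}}))
# ['B: only 1 replica(s), want 3']
```

So a replica of a 10,000-object partition needs roughly 49 MB on SYS, which is worth knowing before you place it.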

The heart of the chapter is summed up in the following problem. You must be able to look at a diagram of network partitions, and a table of replica placements, and then detect the errors in either one. Also, you must be able to apply the principles of this chapter to fill in missing entries in the table, either replicas that should exist or subordinate references that will automatically be created.

 
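Replica placement by server (rows) and partition (columns); "-" marks an empty cell:

Server  [Root]  Comp    Loc 1   Loc 2   Loc 3   Dpt 1   Dpt 2   Dpt 3   Dpt 4   Dpt 5
S1      -       -       Master  -       Master  Master  -       -       -       -
S2      R/W     -       R/W     -       -       -       -       -       -       -
S3      Master  R/O     -       Master  R/W     -       -       Master  R/W     -
S4      -       Master  -       R/W     -       -       -       -       -       Master
S5      -       -       -       -       -       -       -       -       Master  -
S6      R/W     S/R     -       -       -       -       Master  -       -       -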

In the table above, I have left out several Subordinate References (usually abbreviated S/R). I am only showing one, for the Comp partition on Server 6. There are also several missing replicas that should be there. Review this material, and see if you can tell what is missing.

The chapter continues with a discussion of partition operations. Some tasks that can be done with partitions:

  • Create a new partition - Unless you are creating a new tree, a new partition will be a child of an existing partition. Assuming that you have replicas of the parent on servers, NetWare will automatically create a Master replica, and up to two Read/Write replicas of the new partition, placing the replicas on the servers that hold the replicas of its parent.
  • Merge partitions - If you want to remove a partition from the tree, you merge it with its parent. Only parents and children can be merged.
  • Replicate a partition - As noted above, NetWare will create a Master and two Read/Write replicas of a partition automatically, but only when new servers are added to the partition. You can use ConsoleOne to create new replicas, placing them on the servers of your choice.
  • Move a partition - Moving a partition really means moving the container that is the partition root to a new place in the tree. You cannot move a partition if it has child partitions, so a merge may be required before a move can take place.
  • Delete a replica - Specific replicas can be deleted. It is recommended to remove replicas from a server if the server is to be removed from the network. You cannot delete the Master replica of a partition, but you can make another replica the Master, demote the old Master to a Read/Write, and remove the Read/Write. Alternatively, you can merge a partition into its parent, which would remove all replicas of the now nonexistent child.

Backing up eDirectory can be done with whatever software you use to create backups of the NetWare file system. Novell includes backup and restore software with NetWare.

Although it is recommended to make frequent backups, the text cautions that your first line of support for eDirectory is found in the multiple replicas of each partition. It is better to rely on another replica than on a tape, since synchronization of replicas will most likely take place more often than making backups.

As I noted above, replicas are saved in hidden directories. They exist on the SYS volume on a server. Not a surprise, really, since that is the only volume Novell can predict will exist on every server. Remember that if the SYS volume runs out of space, the server crashes. Some guidelines are offered about keeping the SYS volume healthy:

  • Set the system to warn when SYS is running low on space. By default this happens when the server has 256 free blocks on the hard drive.
  • Don't put print queues on the SYS volume. Make other volumes for queues, if needed.
  • NDPS print systems use spools, not queues. Don't let spooling take place on SYS.
  • If users are allocated file space, place their space on another volume.
  • Don't put replicas on servers that do not have generous space on them.
  • You can set an attribute for file system directories to Purge Immediately. This will cause deleted files to become purged immediately after they are deleted. Don't do this without considering the implications of not being able to recover files.

More advice is offered about taking down servers that hold replicas. In general, remove replicas from servers that will be down for extended periods. Before removing a server permanently, move the replicas off it, then remove eDirectory from it.

The last topic in this rather long objective is detecting and resolving eDirectory inconsistencies.

Identify the Problem - Three types of problems are listed:

  • Client Symptoms - these symptoms indicate replicas out of synchronization
    • User is prompted for a password when they have none.
    • Logging in takes an unusually long time.
    • eDirectory changes disappear.
    • eDirectory rights that were granted disappear.
    • Errors continue, but are erratic in appearance.
  • Unknown Objects - This may indicate a synchronization error, or it may indicate that partitions are being merged or created. It is also possible that you have multiple versions of eDirectory in your network.
  • eDirectory Error Messages
    • You can watch for error messages associated with applications. NetWare error numbers may appear in application message windows as decimal or hex numbers.

Design and Implement a Time Synchronization Strategy

It is very important for devices on a network to agree about the time at which events take place. When two or more people work on the same file, "he who saves last, saves best". Similarly, eDirectory changes that take place in multiple replicas must reconcile with each other, so there must be an indisputable method of determining the actual order of events. Servers determine the time on a network. The text lists four services that use server time:

  • eDirectory uses it to track changes to the database
  • File systems use it to track changes to files and directories
  • Messaging applications use it to track when messages are sent, received, and opened
  • Network applications also use it to track file changes

Every action on a network involving eDirectory is given a time stamp. The time stamp marks when the event happened, so that eDirectory can reconcile its partition replicas in the proper order. The time system used in NetWare is Coordinated Universal Time (UTC, sometimes written Universal Time Coordinated), which is based on Greenwich Mean Time. The Prime Meridian (Longitude 0) passes through the Royal Observatory at Greenwich, England. The time at Greenwich has long been a base for determining the time elsewhere in the world, so this is a logical extension of historical precedent.

UTC is calculated by taking the local time, adding or subtracting the number of time zones away from Greenwich (add if you are west of Greenwich, subtract if you are east of it), and subtracting an offset for daylight saving time (if appropriate, that is, if you are using daylight time now). The Eastern Time Zone of the United States, for example, is five hours west of Greenwich, so we would take the local time and add five, then subtract 1 if daylight saving time is in effect, to obtain UTC time.
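
The arithmetic is easier to trust once you have seen it run. Here is a minimal Python sketch of the calculation just described. The function name and the sample dates are my own, and a positive zone count means west of Greenwich (use a negative number for zones east of it).

```python
from datetime import datetime, timedelta


def local_to_utc(local_time, zones_west, daylight_saving):
    """Apply the rule above: add the time zone offset, then subtract
    one hour if daylight saving time is in effect."""
    dst_offset = timedelta(hours=1) if daylight_saving else timedelta(0)
    return local_time + timedelta(hours=zones_west) - dst_offset


# Eastern Time is five zones west of Greenwich.
winter = local_to_utc(datetime(2002, 1, 15, 14, 0), 5, False)
summer = local_to_utc(datetime(2002, 7, 15, 22, 0), 5, True)
print(winter)  # 2002-01-15 19:00:00
print(summer)  # 2002-07-16 02:00:00
```

The summer example shows why the date matters as well as the hour: 10 PM Eastern Daylight Time is already the next day in UTC.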

Set Up Time Synchronization

All NetWare (4.x or later) servers are time servers. Of course, it is not that simple: there are several types. Some types adjust their clocks based on time information from other servers, while other types are set in positions of authority in a hierarchy and they do not change their clocks (except when corrected from a more authoritative source). More on that below. To set up time synchronization for your network, you must first determine which condition below describes your network:

  • pure IPX
  • pure IP, or mixed IP/IPX

If your network is IPX only, you need to understand three concepts:

  • types of time servers used in IPX
    • Single Reference - part of the default configuration. The first server in the tree becomes the Single Reference server which provides time to all other servers, which are Secondaries. The Single Reference server does not adjust its clock for other servers, but it can get its time from the internal DOS clock or an outside source like the Internet.
    • Reference - at the top of a hierarchy of a time provider group, the Reference server consults with Primary servers (see below). It can get its time from the Internet (or other external source) or an internal clock. It does not adjust its own clock if the Primary servers (next layer down in the hierarchy) disagree about the time.
    • Primary - in the middle of the time provider group hierarchy. These servers poll their Reference server and other Primaries about the time. In cases of disagreement, votes are cast: each Primary server gets 1 vote, and each Reference server gets 16 votes. This results in a calculated weighted average that can be called the network time. The Primary server then adjusts its time by half the difference at each poll, but ONLY if the Reference server is AHEAD of it. Time is never adjusted backward. If the Primary server is fast, it will slow its clock until the network time catches up to it.
    • Secondary - at the bottom of both kinds of hierarchies, this server receives time from other servers, provides it to other secondaries and clients, never votes, and always adjusts by 100% if it is behind the correct time. As above, time is never adjusted backward. That would lead to inconsistencies with actions already taken on the server.
  • Configuring TIMESYNC.CFG - This is a text file holding time settings for a server which affect how the TIMESYNC.NLM program runs. (The settings in the configuration file make more sense once you understand the whole chapter.) This file tells the server how often polling takes place, what kind of time server it is, whether to use SAP for time services, whether to use its own hardware clock and so on. The settings in this file can be changed with a text editor, by using SET commands at the console, or by using the MONITOR.NLM program to change server settings.
  • Understanding how servers negotiate for time - servers can use Service Advertising Protocol (SAP) to request or to provide the network time, or they can be placed into configured lists (time provider groups). SAP is the default. It is easy to use, but it generates extra net traffic. Configured lists create less traffic, but the administrator must configure the TIMESYNC.CFG file for each server in a group every time a server is added to or deleted from a time provider group. For fault tolerance, use configured lists and turn SAP on as well. This allows a server to find another time source if all servers in its configured list are down.
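
The voting scheme for Reference and Primary servers is just a weighted average, which the following Python sketch works through. It models only what these notes say (16 votes per Reference, 1 per Primary, Primaries correct by half, Secondaries by 100%, nobody steps backward). Clocks are plain numbers of seconds, and I am assuming every polled clock is counted in the vote; TIMESYNC.NLM itself is certainly more involved.

```python
REFERENCE_VOTES = 16
PRIMARY_VOTES = 1


def network_time(reference_clocks, primary_clocks):
    """Weighted average of the polled clocks (all values in seconds)."""
    votes = (REFERENCE_VOTES * len(reference_clocks)
             + PRIMARY_VOTES * len(primary_clocks))
    weighted = (REFERENCE_VOTES * sum(reference_clocks)
                + PRIMARY_VOTES * sum(primary_clocks))
    return weighted / votes


def primary_step(own_clock, net_time):
    """A Primary moves forward by half the difference, never backward."""
    difference = net_time - own_clock
    return difference / 2 if difference > 0 else 0.0


def secondary_step(own_clock, net_time):
    """A Secondary moves forward by the whole difference, never backward."""
    difference = net_time - own_clock
    return difference if difference > 0 else 0.0


net = network_time([100.0], [82.0, 100.0])
print(net)                        # 99.0
print(primary_step(82.0, net))    # 8.5
print(primary_step(100.0, net))   # 0.0
print(secondary_step(90.0, net))  # 9.0
```

Notice how little the slow Primary (82) drags the result: with 16 votes against 2, the Reference server dominates, and the network time ends up only one second away from it.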

Time synchronization in an IP or mixed IP/IPX environment is described next:

  • TIMESYNC.NLM is still used, but so is NTP.NLM, the Network Time Protocol module. This protocol allows servers to obtain time from Internet sources. One server runs the NTP protocol, checks the Internet source, and then provides time to the other servers, whether the others are IP or IPX servers.
  • The NTP protocol has the ability to reject a time source if it is more than 1000 seconds different from the local clock. The protocol labels a source that is more than 1000 seconds different as insane.
  • The NTP.NLM program reads the NTP.CFG file for its settings. Two types of time sources are used in this file: servers and peers. A source listed as a server in the NTP.CFG file does not adjust its time to the local time. A source listed as a peer may adjust its time. Do not use an Internet time source as a server unless it is known to be reliable.
  • The default setting is for the local clock to act as a server. 127.127.1.0 is the IP address for a computer's local clock. This is somewhat practical if you have no Internet connection. Other addresses and host names for Internet time sources are given in the text. When I pinged the IP addresses given, I got no responses. When I pinged the host names below, they responded. This illustrates the fact that hard coding addresses is not the best method, unless they are under your control. Host names do not change as often as addresses, and DNS service will usually find the correct address for a host name. Two good host names:
    • server ntp2.usno.navy.mil
    • server clock.llnl.gov

    The word "server" is used above to illustrate how the entry might appear in an NTP.CFG file.
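
The 1000-second "insane" test mentioned above is simple to picture in code. This Python sketch is only an illustration of that one rule; the function names and the sample source names are invented, and times are plain numbers of seconds rather than anything NTP.NLM actually exchanges.

```python
INSANE_THRESHOLD_SECONDS = 1000


def is_sane(source_time, local_time):
    """A source is rejected as insane when it differs from the local
    clock by more than 1000 seconds."""
    return abs(source_time - local_time) <= INSANE_THRESHOLD_SECONDS


def usable_sources(sources, local_time):
    """sources: hypothetical mapping of source name -> reported time."""
    return sorted(name for name, reported in sources.items()
                  if is_sane(reported, local_time))


local = 50000
sources = {"source-a": 50020, "source-b": 48000, "source-c": 51000}
print(usable_sources(sources, local))  # ['source-a', 'source-c']
```

The check cuts both ways: a good source is rejected too if the local clock is the one that is far off, so set the server's clock roughly right before starting NTP.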

Plan a Time Synchronization Strategy for a Network

First, benefits of using a Single Reference server:

  • It is simple. In fact, it is automatic. The administrator doesn't need to do anything extra to set it up.
  • No tweaking of configuration files.
  • When new servers are added, no changes are necessary. The Single Reference is still the Single Reference, and everything else is a Secondary.

A list of "considerations" for the Single Reference method:

  • All other servers must communicate with the Single Reference server
  • If the Single Reference server is bad, it affects every other server
  • Secondary servers are not picky: another server could claim to be the Single Reference and change their network time
  • Single point of failure
  • SAP traffic is not a good idea over WAN links

Recommendations about time provider groups:

  • Use a pyramid hierarchy - secondaries synchronize to primaries, primaries synchronize to reference. It is allowed for some secondary servers to synchronize with other secondary servers. It is recommended that you do not do so.
  • Use no more than seven reference and primary time servers (total) in a single time provider group
  • Time providers should be on reliable network routes

Benefits of using time provider groups:

  • Administrator has more control
  • Administrator can optimize related traffic
  • Provides specific alternative time providers

Some "considerations" for time provider groups:

  • Takes careful planning
  • Adding or removing a time provider means changing lots of files
  • You must designate at least one Reference and at least two Primaries. No more than seven Primaries in a single group.
  • Servers that poll each other must have reliable routes to do so
  • If using more than one Reference server, all Reference servers must synchronize with the same external time source
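
A planned time provider group can be checked against these rules before you touch any configuration files. The Python below is a sketch under stated assumptions: the server names and type labels are made up, and for the upper limit I have used the "no more than seven Reference and Primary servers in total" wording from the recommendations list.

```python
MAX_TIME_PROVIDERS = 7


def check_time_provider_group(server_types):
    """server_types: server name -> 'REFERENCE', 'PRIMARY', or 'SECONDARY'.
    Returns a list of problems; an empty list means the plan passes."""
    references = [s for s, t in server_types.items() if t == "REFERENCE"]
    primaries = [s for s, t in server_types.items() if t == "PRIMARY"]
    problems = []
    if len(references) < 1:
        problems.append("need at least one Reference server")
    if len(primaries) < 2:
        problems.append("need at least two Primary servers")
    if len(references) + len(primaries) > MAX_TIME_PROVIDERS:
        problems.append("more than seven time providers in the group")
    return problems


plan = {"FS1": "REFERENCE", "FS2": "PRIMARY", "FS3": "SECONDARY"}
print(check_time_provider_group(plan))
# ['need at least two Primary servers']
```

Promoting FS3 to a Primary would make this plan pass.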

If your network has multiple locations (spans WANs), a separate time provider group is recommended for each site.

Primary time servers provide fault tolerance in case the Reference server is not available.