Advanced Novell Network Management: NetWare 6

Chapter 4: Monitor and Troubleshoot eDirectory

 

Objectives:

This chapter discusses troubleshooting and resolving server problems. The objectives important to this chapter are:

  1. Identify eDirectory Databases and Processes
  2. Identify eDirectory Troubleshooting Steps
  3. Identify Partition and Replication Placement Design
  4. Use iMonitor Reports to Obtain Server and eDirectory Information
  5. Perform Health Checks

Concepts:

Identify eDirectory Databases and Processes

The text reminds us that eDirectory is a database, typically divided into partitions, and stored in multiple replicas throughout your network. It makes sense to learn some methods to keep it running smoothly.

The text describes the component files of the eDirectory database:

  • NDS.DB - This is the roll back file, used to back out of transactions that did not complete.
  • NDS*.LOG - This is the roll forward file, used to reapply completed transactions that were not fully written to the database.
  • NDS.01 - This is an actual database file, which can hold up to 2 GB of information. When your eDirectory information grows larger than 2 GB, NDS.02 is created. More files are created as eDirectory grows.
  • hex.NDS - The first few characters of this series of files are hexadecimal numbers. They are streams files, and they hold login scripts and print job configurations.
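
The naming patterns in the list above can be sketched as a small classifier. This is only an illustration: the function name classify, the role labels, and the regular expressions are assumptions made for this example, not part of eDirectory or any Novell tool.

```python
import re

# Role labels are hypothetical names chosen for this sketch.
ROLES = {
    "rollback": "NDS.DB, the roll back file",
    "rollforward": "NDS*.LOG, the roll forward file",
    "data": "NDS.01, NDS.02, ... database files (up to 2 GB each)",
    "stream": "hex.NDS streams files (login scripts, print job configurations)",
}

def classify(filename: str) -> str:
    """Return the role label for an eDirectory database file name."""
    name = filename.upper()
    if name == "NDS.DB":
        return "rollback"
    if re.fullmatch(r"NDS.*\.LOG", name):
        return "rollforward"
    if re.fullmatch(r"NDS\.\d{2}", name):
        return "data"
    if re.fullmatch(r"[0-9A-F]+\.NDS", name):
        return "stream"
    return "unknown"

# The file names below are made-up examples that fit each pattern.
for example in ["NDS.DB", "NDS00001.LOG", "NDS.02", "0004A3C0.NDS"]:
    print(example, "->", ROLES[classify(example)])
```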

Three kinds of synchronizations that affect eDirectory are described:

  • time synchronization - Changes made in eDirectory are time stamped, which makes it important to have servers agree on the time in their network.

    When an object is stamped with a time that is ahead of the current network time, eDirectory switches to "synthetic time" for objects in that partition. Each later change to an object in the partition is stamped with that same future time, which stays frozen until network time catches up.

    Sequence of events is still maintained by appending a sequence number to each time stamp: each new event gets the same time, plus a number that increments from 0 to FFFF (hexadecimal). If more than 65,536 separate events occur before time catches up, the partition clock is allowed to tick for one second, then the sequence numbers start at 0 again.
  • schema synchronization - The schema is like a blueprint for eDirectory. It describes all the types of objects that can exist in your tree, and all their attributes. If the tree did not have a single schema, some objects would appear as impossible to some Directory Agents. The command SET DSTRACE=+SCHEMA turns on the display of schema synchronization messages, and SET DSTRACE=*SS forces a schema synchronization to take place. Otherwise, this process occurs every four hours by default.
  • replica synchronization - Replicas of partitions exist because of the need for redundant copies of data. All copies of a partition should be identical at all times. This is, of course, impossible if the replicas cannot or do not communicate with each other. When all replicas of a partition match, they are synchronized.
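
The synthetic time behavior described under time synchronization can be modeled with a toy clock. This is a minimal sketch under stated assumptions: the class PartitionClock, its fields, and the use of whole seconds are invented for illustration and do not reflect how eDirectory stores time stamps internally.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_EVENT = 0xFFFF  # highest sequence number before the clock must tick

@dataclass
class PartitionClock:
    """Toy model of a partition clock that may be running on synthetic time."""
    seconds: int      # most recent time stamp issued, in whole seconds
    event: int = -1   # last sequence number issued (-1 means none yet)

    def stamp(self, network_time: int) -> Tuple[int, int]:
        """Return (seconds, sequence number) for the next change."""
        if network_time > self.seconds:
            # Network time has caught up: use it and restart the sequence.
            self.seconds = network_time
            self.event = 0
        elif self.event < MAX_EVENT:
            # Time is frozen: same second, next sequence number.
            self.event += 1
        else:
            # All 65,536 sequence numbers used: tick one second, start at 0.
            self.seconds += 1
            self.event = 0
        return (self.seconds, self.event)

# A future stamp of 1000 was applied while network time is only 900.
clock = PartitionClock(seconds=1000)
first = clock.stamp(network_time=900)
second = clock.stamp(network_time=900)
print(first, second)
```

Because the pairs compare first on the frozen second and then on the sequence number, the order of events is preserved even though the clock is not advancing.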

Some issues regarding eDirectory replicas and schema are discussed:

Novell recommends that you have 3 to 5 replicas of each partition for fault tolerance. More may be needed in a large tree, but the more replicas that exist of a partition, the more network traffic it will take to synchronize those replicas. Make as many replicas as you need, in order to place a replica near users who need one.

Unknown objects may appear in a tree in several circumstances. They come in two types: a question mark in a circle will appear next to an object that eDirectory does not recognize, and a question mark in a square will appear next to an object that eDirectory has no tools to manage. These unknown objects can appear if you have not installed snap-ins for a new object type, if the object has attributes not defined in your schema, if a required attribute is missing from the object, if the object is located in a container that the schema does not allow to hold it, or if the object has no name. These conditions may seem unlikely, but they can arise if you import an object into your tree from a newer schema, if you merge trees without synchronizing the schemas, or if you restore a tree without updating the schema. The text explains that in older versions of eDirectory, it was not possible to modify the class of an object (its design template). Since you can do so now, it is more likely that you will import an object from a tree whose schema was modified in this way.
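
The schema-related conditions above can be expressed as a simple check. This is a hedged sketch only: the SCHEMA dictionary is a simplified, hypothetical stand-in for a real schema, and unknown_reasons is a name invented for this example. The first check on the class itself is an added assumption about how an object from a newer schema might fail.

```python
# Hypothetical, simplified schema: class name -> rules for that class.
SCHEMA = {
    "User": {
        "required": {"CN", "Surname"},
        "optional": {"Title", "Telephone Number"},
        "containers": {"Organization", "Organizational Unit"},
    },
}

def unknown_reasons(obj, parent_class, schema=SCHEMA):
    """List the conditions from the paragraph above that apply to obj."""
    reasons = []
    if not obj.get("name"):
        reasons.append("object has no name")
    rules = schema.get(obj.get("class"))
    if rules is None:
        # Assumption: a class missing from the schema cannot be checked further.
        reasons.append("class is not defined in this schema")
        return reasons
    present = set(obj.get("attributes", []))
    for attr in sorted(present - rules["required"] - rules["optional"]):
        reasons.append(f"attribute not defined in schema: {attr}")
    for attr in sorted(rules["required"] - present):
        reasons.append(f"required attribute missing: {attr}")
    if parent_class not in rules["containers"]:
        reasons.append(f"container class not allowed: {parent_class}")
    return reasons

# A made-up object with several of the problems at once.
suspect = {"name": "", "class": "User", "attributes": ["CN", "Badge Color"]}
for reason in unknown_reasons(suspect, parent_class="Country"):
    print(reason)
```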

Identify eDirectory Troubleshooting Steps

The basic advice Novell gives about eDirectory problems is to leave them alone for a while. Errors due to synchronization problems can be corrected when synchronization occurs, which may take a few minutes or several hours, depending on the amount of information to synchronize, the bandwidth available, and the processing power available.

When problems do not correct themselves, there are some steps to follow.

  1. Identify the Scope of the Problem - You can do this by noting the symptoms and error messages, and by identifying the servers that hold replicas of the partition in which the error has occurred. If the error is located on one server, check four log files on it: Server Personal Log (SYS:SYSTEM/NRMUSERS.LOG), System Error Log (SYS:SYSTEM/SYS$LOG.ERR), Abend Log (SYS:SYSTEM/ABEND.LOG), and Server Health Log (SYS:SYSTEM/HEALTH.LOG). Remote Manager will provide access to the log files.
  2. Determine the Cause of the Problem - The text suggests that this may be difficult, since an error message may only be a symptom of a larger problem. The best approach is to research the symptoms you have noted. Look online at the Novell web site, and at other sites you may know.
  3. List Possible Solutions - Solutions may be found in your documentation, in the online Knowledgebase, or at Cool Solutions. Make sure you know what effects a possible solution will have, such as losing all eDirectory data by reinstalling eDirectory.
  4. Assess Possible Solutions - Decide which solution or solutions to try, based on likelihood of success, your experience, and the advice of others.
  5. Implement a Solution - Make backups, try your chosen solution, and give eDirectory time to synchronize.
  6. Verify Resolution - Run your diagnostics again, looking for signs that the problem is resolved or still in place.
  7. Document the Resolution - This step is often overlooked. It is important to document problems you have had and their resolutions. It could become necessary to undo a resolution, which would be difficult if undocumented.
  8. Avoid Repeating the Problem - This implies learning from the experience, and taking action to avoid the things that caused the problem.
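
Step 1 of the list above can be sketched in code. This is an illustration under assumptions: the replica ring data, partition names, and server names are hypothetical, and scope_of_problem is a helper invented for this example. Only the four log file names come from the text.

```python
# Log files named in step 1 of the troubleshooting list.
SERVER_LOGS = {
    "Server Personal Log": "SYS:SYSTEM/NRMUSERS.LOG",
    "System Error Log": "SYS:SYSTEM/SYS$LOG.ERR",
    "Abend Log": "SYS:SYSTEM/ABEND.LOG",
    "Server Health Log": "SYS:SYSTEM/HEALTH.LOG",
}

def scope_of_problem(partition, replica_rings):
    """Return the servers to examine and, for a single server, its logs."""
    servers = sorted(replica_rings.get(partition, []))
    # The text suggests the four logs when the error is on one server.
    logs = list(SERVER_LOGS.values()) if len(servers) == 1 else []
    return {"servers": servers, "logs": logs}

# Hypothetical replica rings: partition name -> servers holding a replica.
rings = {
    "OU=Sales.O=Example": ["SRV1", "SRV2", "SRV3"],
    "OU=Lab.O=Example": ["SRV4"],
}
print(scope_of_problem("OU=Lab.O=Example", rings))
```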

Identify Partition and Replication Placement Design

This objective is not really addressed in the chapter. The text points out that a separate partition has been made for each of six cities, but the Master replica for each has been placed on a server located elsewhere. This is probably not the best design, so the chapter walks you through creating new replicas of those partitions. You create a Read/Write replica, placing it on a server in the city in question, then make that new replica the Master instead of the one in the wrong city.
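
The placement check described above can be sketched as follows. This is a rough model, not a management tool: the partition, city, and server names are hypothetical, and relocation_plans is a function invented for this example. It only produces the two steps the chapter walks through.

```python
def relocation_plans(partitions):
    """Find partitions whose Master replica sits outside the partition's city."""
    plans = {}
    for name, info in partitions.items():
        master = next(
            (r for r in info["replicas"] if r["type"] == "Master"), None
        )
        if master is None or master["city"] == info["city"]:
            continue  # nothing to move
        plans[name] = [
            f"Add a Read/Write replica on {info['local_server']}",
            f"Make the replica on {info['local_server']} the Master "
            f"(replacing {master['server']})",
        ]
    return plans

# Hypothetical tree: one partition per city.
tree = {
    "OU=Phoenix": {
        "city": "Phoenix",
        "local_server": "PHX-SRV1",
        "replicas": [{"server": "NYC-SRV1", "city": "New York", "type": "Master"}],
    },
    "OU=Boston": {
        "city": "Boston",
        "local_server": "BOS-SRV1",
        "replicas": [{"server": "BOS-SRV1", "city": "Boston", "type": "Master"}],
    },
}
for partition, steps in relocation_plans(tree).items():
    print(partition, steps)
```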

Use iMonitor Reports to Obtain Server and eDirectory Information

This objective addresses reports available to the administrator from iMonitor. Two report options are presented on its menus: Reports and Report Config. They are not named very well. Reports takes you to a list of the reports that have been run already. Report Config actually takes you to a list of reports that can be run. The features of some reports are listed:

  • Server Information - This one takes a look at all servers in your tree. It can help diagnose problems with time synchronization or limber, a process that updates pointers to replicas when a server name or address is changed.
  • Obituary Listing - Lists all obituaries on a server. An obituary is a record of an object that has been deleted from one replica, but may not have been removed from others yet.
  • Object Statistics - More like a search report, it can be used to find unknown objects, renamed objects, and other possible trouble points.
  • Service Advertising - Your network may be using SAP, SLP, or both. This report returns a list of all directories and servers that your server has learned about by these protocols.
  • Agent Health - An agent is a server that is running eDirectory. This reports the health of your server.
  • Custom Report - You can customize a report or schedule one to run at a later date. Note that although you can schedule a report to run, it will only gather the information available to the Public trustee. If a report requires higher rights or security, you must run it yourself, not schedule it.

Perform Health Checks

The chapter continues with a list of things to check that contribute to the general health of eDirectory.

  • eDirectory version/revision - If not current, update to the current version with downloadable patches from Novell.
  • Time synchronization - As already noted in this text, time synchronization is vital to synchronizing updates to eDirectory objects.
  • Partition continuity - This is a feature of ConsoleOne which can tell you if all replicas of a partition are identical.
  • Background processes - Some processes in eDirectory are meant to run in the background. Three are listed that bear attention. Schema synchronization status can tell you if your schema is understood by all eDirectory Agents. Obituaries have been discussed before, but this time we are told that there are eleven different obituary types, including moves and renames, in addition to deletes. External references are placeholders or pointers to information that has been requested, but is not stored on the server receiving the request.
  • Limber status - As noted above, limber is a process that updates eDirectory when server information is changed. The limber process can be started manually from iMonitor or using DSTRACE.

iMonitor can be used to check the above health indicators. When errors appear in iMonitor, you can click the error numbers to see proposed solutions.

Sometimes a solution requires you to initiate an eDirectory process. If so, you can monitor what happens with Trace, from inside iMonitor. (This is equivalent to running DSTRACE from a command line.) Watch for lines that read "All Processed=Yes". These are success messages for eDirectory processes.
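
If trace output has been saved as text, watching for the success message can be automated. The sketch below assumes the output is available as a list of lines. The sample lines are invented placeholders, not real Trace output, and successful_lines is a hypothetical helper name. Only the marker string comes from the text.

```python
SUCCESS_MARKER = "All Processed=Yes"

def successful_lines(trace_lines):
    """Return the trace lines that contain the success marker."""
    return [line.strip() for line in trace_lines if SUCCESS_MARKER in line]

# Placeholder lines for illustration only; real Trace output looks different.
sample = [
    "(placeholder) process started for partition A",
    "(placeholder) partition A: All Processed=Yes",
    "(placeholder) process started for partition B",
]
hits = successful_lines(sample)
print(len(hits), "success message(s) found")
for line in hits:
    print(line)
```

A process that never produces a matching line is a candidate for closer inspection in the stored Trace log files.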

Each time Trace is run, its results are stored in log files, accessible from iMonitor. These files can be kept or deleted as needed.

If you need to repair eDirectory, you can either run DSREPAIR at the server console, or run Directory Service Repair from iMonitor (on the server you are monitoring). Be aware that you cannot do both at the same time. Results of the Repair function are stored in REPAIR.HTM, which is, of course, viewable with your browser.