Troubleshooting Theory
Understanding the theory behind troubleshooting GroupWise issues should provide you with a foundation for identifying a problem. After you identify problems, you can resolve the majority of them by using the correct tools.
Troubleshooting theory focuses on three types of information that you can use to identify problems:
- Error classes
- Architecture and component responsibilities
- Questions
The error classes are used to help isolate which component is at fault and where the problem is occurring. The architecture and component responsibilities are used to validate who is responsible for the error and narrow down possible problem areas. Finally, certain questions should be asked to ensure that the perceived problem is the real problem.
Error Classes
With GroupWise, many errors can and will be seen. These errors range from database-specific, to client-specific, to operating-system-specific errors. GroupWise errors are categorized into classes. The class is indicated by the first two digits or the first letter of the error code, which point to the most likely kind of problem.
Error Class | Description |
82xx | File input/output (I/O) errors. These deal with file rights, drive specifications, mappings, and file location. |
89xx | TCP/IP communication errors. These indicate a problem with the TCP/IP communication, which is often caused by misconfigured agents, clients, or TCP/IP stacks on the server or workstation. |
Cxxx | Database structural problem. Most of these errors are corrected with a structural rebuild of the database. |
Dxxx | Database content problem. A Dxxx class error can be resolved with a content check and fix, provided the content is in a state that can be fixed. Another alternative for correcting content problems is to restore the database from a backup. |
Exxx | Document Management Service (DMS) errors. This class of errors is specific to document management. However, many DMS databases and Binary Large OBjects (BLOBs) could be the source of the problem. Likewise, many databases contain both DMS and other data store information that could be the source of a problem. |
Fxxx | QuickFinder errors. The QuickFinder is the component responsible for indexing the GroupWise data store. It could point to either mailbox indexes or to the DMS indexes. |
By itself, the error class rarely leads to the real problem. However, it provides a starting point for troubleshooting any problem. Once you also know who received the error and where, when, and how it occurred, the cause and its resolution are usually close at hand.
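As a minimal sketch of how the class lookup works, the following Python function maps an error code to its class using only the prefixes listed in the table above. The function name and the abbreviated descriptions are illustrative assumptions, not part of any GroupWise tool.

```python
# Error classes as listed in the table above; descriptions are abbreviated.
ERROR_CLASSES = {
    "82": "File input/output (I/O) error",
    "89": "TCP/IP communication error",
    "C": "Database structural problem",
    "D": "Database content problem",
    "E": "Document Management Service (DMS) error",
    "F": "QuickFinder error",
}


def classify_error(code: str) -> str:
    """Return the error class description for a GroupWise error code."""
    code = code.strip().upper()
    # Numeric classes are identified by the first two digits,
    # lettered classes by the first letter, so try both prefix lengths.
    for prefix_length in (2, 1):
        description = ERROR_CLASSES.get(code[:prefix_length])
        if description is not None:
            return description
    return "Unknown class; look the code up in the error message list"
```

For example, `classify_error("C04F")` returns "Database structural problem". The class is only a starting point and still has to be combined with the architecture and the questions described next.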
Architecture and Component Responsibility
You should memorize the GroupWise architecture. The directory structure, the database files, and the message flow are always consistent within a GroupWise system. Knowing the architecture provides a basic foundation that can often make the difference between troubleshooting the problem or guessing at the problem.
You also need to be familiar with the responsibilities of the GroupWise components, including which databases each one reads from and writes to. When you understand the responsibilities, you can determine where a problem is, which can also lead you to its probable cause. In turn, you can provide a real solution rather than a quick fix that just hides the problem.
Understanding where each component fits in the system and what it does will greatly reduce the time you need to identify and resolve the issue. When you combine this information with the error code information, the real problem starts to become clearer.
Questions
Part of troubleshooting effectively is simply asking the right questions. At first, asking questions can be time-consuming and tedious. However, after you learn which questions to ask, your troubleshooting and issue resolution skills will improve.
The five basic questions are who, what, where, when, and how.
Who? The answer to a who question is not always obvious. In fact, several who questions, such as the following, could be asked:
- Who reported the problem?
- Who saw the problem?
- Who is seeing the problem?
- Who has the problem but doesn't know about it?
Some of these questions won't be asked until other questions are asked, but they can be important to the resolution of the problem.
What? Several what questions, including the following, can also be asked:
- What is the problem? Most likely, the customer cannot identify the real problem. The customer can only tell what he or she knows. Make sure the problem is understood before making assumptions.
- What is happening around the problem?
- What has changed?
Where? The where question is one of the most important questions. It can identify a trend or a location of the problem. The where questions can provide a branching factor when you are trying to eliminate possibilities. Some of these questions are as follows:
- Where is the problem occurring: In the client? From the same workstation? From a different user's mailbox?
- Where is the problem being reported?
When? This is another important question for eliminating possibilities.
- When does the problem occur: Only when sending? Only when starting the client? Only when a scheduled event takes place?
- When is it seen: Right away? Every time? Only when sending to that user?
How? The how questions are not always applicable, but they can be used in some situations. They can help you verify information. Here are some examples:
- How was the problem duplicated?
- How is it set up?
- How is it seen: The same every time? In the same place every time? With the same error every time?
Bringing It All Together
When you use all three of these techniques, you have a powerful tool for rooting out problems. They do not need to be used in a specific order. In fact, you should use the questioning throughout the entire process. Here is an example of this type of troubleshooting methodology.
Problem: I am getting a C04F error when I send a message to John.
Troubleshooting: C04F is a Cxxx class error, meaning that it is a database structure problem. You know that when a user sends a message, information is written to the user database, the message database, and a BLOB file if there is an attachment or if the message or distribution list is larger than 2 KB. You also know that an outbound message is always written to the same message database.
Conclusion: Based on this information, the answers to a few other questions, and the configuration of the post office, you can isolate the problem to the message or user database. In this example, running GWCheck with an Analyze and Fix on the structure would correct the problem.
Knowing what the error classes indicate, understanding the architecture and functions of GroupWise, and asking the right questions will most often lead to the source of a problem. The solutions to the problems can range from database check utilities, to product patches, to bug reports.
General GroupWise Troubleshooting Strategies
This section presents strategies for troubleshooting general GroupWise-related problems. The online GroupWise administration guides referred to can be accessed on the Web at:
If you experience a problem that these troubleshooting strategies do not solve, see the Novell Support Connection Web sites at:
Look Up Error Messages and Codes
Problem: You receive an error message in GroupWise.
Action(s): Look up the error code in the list of GroupWise error messages in Book 1: Error Messages in the online GroupWise 5.5 Troubleshooting Guide. You can use the Find feature on your browser's Edit menu to locate a GroupWise error code or message text in the list. If the error you received is not in the list provided, standard solutions are not yet available.
Check GroupWise Agent Logs
Problem: You are experiencing message delivery problems.
Action(s): Use the MTA and POA log files to help you track message delivery problems. Set the logging level to Verbose so all processing information is displayed. This information can help you determine what the problem is. You should also verify the information is being logged to disk.
There are several ways to set the logging level to Verbose: in NWAdmin, with startup switches, or from the agent operation screen.
In NWAdmin. You can configure default log settings for each agent in the Log Settings page for each agent object.
- In the NWAdmin browser window, double-click the NDS container where the GroupWise post office is located (if configuring logging for the POA) or the container where the GroupWise domain is located (if configuring logging for the MTA).
- Double-click the post office object or domain object.
- Right-click the agent object and select Details | Log Settings. The Log Settings dialog box displays (see Figure 1).
Figure 1: Log Settings dialog box in NWAdmin.
- Choose the settings you want for the log files, including Verbose for Logging Level.
With Startup Switches. Log settings provided in the agent startup file override default settings made in NWAdmin. The following startup switches are available to configure logging for each agent:
- /log=<path>. Specifies the directory where the agent will store its log files. The default locations are <post_office>\WPCSOUT\OFS for POA log files and \MSLOCAL for MTA log files. You will typically find multiple log files in the specified directory. In each log file name, the first four characters represent the date and the next three identify the agent. A three-digit extension allows for multiple log files created on the same day. For example, a log file named 0518MTA.001 is an MTA log file created on May 18. If you restarted the MTA on the same day, a new log file named 0518MTA.002 would be started.
- /loglevel=. Sets the amount of data displayed in the log message box and written to the agent log file during the current agent session. Possible settings are:
- Normal (default). Displays only the essential information suitable for a smoothly running agent.
- Verbose. Displays the essential information, plus additional information helpful for troubleshooting.
- /logdiskoff. Turns disk logging off so no information about the functioning of the agent is stored on disk.
- /logdays=. Sets the number of days you want agent log files to remain on disk before being automatically deleted. The default log file age is 7 days.
- /logmax=. Sets the maximum amount of disk space for all of that agent's log files. When the specified disk space is consumed, the agent overwrites existing log files, starting with the oldest. The default is 1024 KB of disk space for all of that agent's log files.
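The log file naming convention described under /log can be sketched as a small parser. This is an illustration only: the names `AgentLogName` and `parse_log_name` are hypothetical, and the month-then-day reading of the first four characters is inferred from the 0518MTA.001 example.

```python
import re
from typing import NamedTuple, Optional


class AgentLogName(NamedTuple):
    month: int
    day: int
    agent: str
    sequence: int


# Four date digits (read here as MMDD), a three-letter agent identifier,
# a dot, and a three-digit sequence number.
LOG_NAME_PATTERN = re.compile(r"(\d{2})(\d{2})([A-Za-z]{3})\.(\d{3})")


def parse_log_name(filename: str) -> Optional[AgentLogName]:
    """Split an agent log file name into its parts, or return None if it does not fit."""
    match = LOG_NAME_PATTERN.fullmatch(filename)
    if match is None:
        return None
    month, day, agent, sequence = match.groups()
    return AgentLogName(int(month), int(day), agent.upper(), int(sequence))
```

With this sketch, `parse_log_name("0518MTA.002")` yields month 5, day 18, agent "MTA", and sequence 2, which makes it easy to pick out the most recent log for a given agent and day.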
From the Agent Operation Screen. You can adjust the agent log settings for the current session from the agent operation screen. This overrides any settings made in NWAdmin or in the agent startup file. The modified settings remain in effect until you restart the agent, at which time the log settings specified in NWAdmin or the startup file take effect again.
- At the server or workstation where the agent is running, display the agent operation screen.
- Select Log | Log Options. The Log Settings dialog box shown in Figure 2 is displayed.
Figure 2: Log Settings dialog in the agent operation screen.
- Adjust the values as needed for the current agent session.
Because the agents consist of multiple threads, you may find it useful to retrieve the log file into an editor and sort it on the information (thread ID or facility name) that follows the date and time information. Sorting will group all messages together for the same thread.
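If you prefer a script to an editor, the sort can be sketched in Python as follows. The line layout is an assumption: each line is taken to start with a single time stamp field followed by the thread ID or facility name, and the sample lines are invented for illustration. Adjust the field index to match what your log files actually contain.

```python
def group_by_thread(lines):
    """Sort log lines on the field after the time stamp, keeping time order within each thread."""

    def thread_key(line):
        fields = line.split()
        # Assumption: field 0 is the time stamp, field 1 is the thread ID or facility name.
        return fields[1] if len(fields) > 1 else ""

    # sorted() is stable, so lines that share a key keep their original (chronological) order.
    return sorted(lines, key=thread_key)


# Hypothetical log lines, not taken from a real agent log.
sample = [
    "14:02:11 1C2 Processing item",
    "14:02:11 2A0 Connection opened",
    "14:02:12 1C2 Delivered",
    "14:02:13 2A0 Connection closed",
]
```

Calling `group_by_thread(sample)` places both 1C2 lines first, followed by both 2A0 lines, with each thread's messages still in time order.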
For more information, see "Agent Log Files" in Chapter 6: Message Flow Monitoring in Book 1: Agents' Roles in Message Flow in the online Agent Configuration Guide.
Assign GroupWise Administrator
Problem: No one is receiving error messages generated by the MTA and the POA.
Action(s): Make sure each domain has an administrator who receives error messages generated by the agents. If you want to be notified with an e-mail message whenever an agent encounters a critical error, you can designate yourself as an administrator of the domain where the agents are running.
- In the NWAdmin browser window, double-click the NDS container where the GroupWise domain is located.
- Double-click the domain object and select Details | Information to display the Information page (see Figure 3).
Figure 3: Information page for a GroupWise domain.
- In the Administrator field, browse to select your GroupWise user ID. (A domain can have a single administrator, or you can create a group to function as administrators.)
- Click OK to save the administrator information.
The selected user or group will then begin receiving e-mail messages whenever an agent encounters a critical error.
Check Network Access for Mapped or UNC Links
Problem: Insufficient network rights can cause problems for the GroupWise client and agents when direct access to databases and directories is required. The following can be signs of insufficient network rights:
- Access-denied errors
- Messages are not being delivered
- One or more users are unable to start the GroupWise client
Action(s): Check the rights in the post office where the problem is occurring. For lists of necessary rights, see Chapter 2: GroupWise Agent Rights and Chapter 3: GroupWise User Rights in the online Security Guide.
You can set the proper user rights for all users in a post office or for an individual user in NWAdmin. In the browser window, click a post office or user object and select Tools | GroupWise Utilities | Set Rights.
Check IP Addresses and Port Numbers for TCP/IP Links
Problem: Incorrect IP addresses and port numbers can cause problems for the GroupWise client and agents when TCP/IP connections are used.
Action(s): Take the actions indicated below to solve possible problems.
POA IP Address and Port Number Available to Client. Make sure the GroupWise client is set up with the correct IP address and port number for the POA in each user's post office. Or make sure the GroupWise name server is running so IP addresses and port numbers can be looked up automatically. For more information, see Chapter 4: Set Up GroupWise Name Server in Book 2: Post Office Agent Configuration in the online Agent Configuration Guide.
POA IP Address and Port Number Set Correctly. Make sure the POA is set up with the correct IP address and port number. For more information, see "Use Client/Server Access to Post Office" in Chapter 3: Reconfigure the POA in Book 2: Post Office Agent Configuration in the online Agent Configuration Guide.
MTA IP Address and Port Number Set Correctly. Make sure the MTA is set up with the correct IP address and port number. For more information, see "Use TCP/IP Link between Domains" in Chapter 3: Reconfigure the MTA in Book 3: Message Transfer Agent Configuration in the online Agent Configuration Guide.
Duplicate IP Address or Port Number. Make sure no duplicate IP addresses and port numbers are in use.
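One way to spot conflicts is to list each agent's configured address and port and look for repeated combinations. The sketch below assumes you have gathered that list by hand; the agent names, addresses, and port numbers are placeholders, and the check covers only agents you have recorded, not other hosts on the network.

```python
from collections import defaultdict


def find_duplicate_endpoints(agents):
    """Return {(ip, port): [agent names]} for every address/port pair used more than once."""
    by_endpoint = defaultdict(list)
    for name, ip, port in agents:
        by_endpoint[(ip, port)].append(name)
    return {endpoint: names for endpoint, names in by_endpoint.items() if len(names) > 1}


# Hypothetical inventory of (agent name, IP address, port number).
agents = [
    ("POA-Sales", "192.0.2.10", 1677),
    ("POA-Eng", "192.0.2.11", 1677),
    ("MTA-Main", "192.0.2.10", 7100),
    ("POA-Test", "192.0.2.10", 1677),
]
```

Here `find_duplicate_endpoints(agents)` reports that POA-Sales and POA-Test are both configured for 192.0.2.10 on port 1677, so one of them needs a different port or address.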
Check Available Disk Space
Problem: When one of the GroupWise programs tries to create or modify a file and there is not enough disk space available to complete the task, you will generally get a disk full error message.
Action(s): Free up space on the disk. For help, see Chapter 3: Manage Disk Space in Book 5: Maintain GroupWise Databases in the online Maintenance Guide.
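To catch a shortage before an agent reports it, you can check free space on the volume from a script. This is a generic sketch using Python's standard library, run from a machine that can see the volume; the function name and the threshold you pass in are your own choices, not GroupWise settings.

```python
import shutil


def has_free_space(path, minimum_free_bytes):
    """Return True if the volume holding `path` has at least `minimum_free_bytes` free."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, and free bytes
    return usage.free >= minimum_free_bytes
```

For example, `has_free_space(r"F:\PO1", 500 * 1024 * 1024)` would check for 500 MB free on a hypothetical mapped post office drive.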
Recover and Rebuild Databases
Problem: Problems with a domain database (WPDOMAIN.DB) or post office database (WPHOST.DB) can cause access problems for GroupWise Administrator and the GroupWise client, as well as message delivery problems. Database problems can also cause information in your system to be out of sync (for example, a user's information in one post office being different from that user's information in another post office). User and message database problems can cause users to lose access to their mailboxes, see incorrectly displayed mailbox information, and experience message delivery problems.
Action(s): For database recovery and rebuild procedures, see Chapter 5: Repair GroupWise Databases and Indexes and Chapter 7: Use Standalone GroupWise Check in Book 5: Maintain GroupWise Databases in the online Maintenance Guide.
Verify GroupWise System Information
Problem: Messages are not being delivered to post offices or domains, or you are receiving excessive undeliverable messages.
Action(s): Take the appropriate action according to whether you have configuration problems or synchronization problems.
Configuration Problems. Configuration problems with your GroupWise system can cause message delivery problems. Check the following areas:
- Links between domains. See "Link Management Tasks" in Chapter 3: Message Routing between Domains in Book 1: Agents' Roles in Message Flow in the Agent Configuration Guide.
- Links to post offices. See "Links between the Domain and Its Post Offices" in Chapter 2: Message Transfer between Post Offices in the Same Domain in Book 1: Agents' Roles in Message Flow in the Agent Configuration Guide.
- Domain and post office properties. See Book 1: Add a Post Office and Book 2: Add a Domain in the online Post Office and Domain Setup Guide.
- Agent properties. See Chapter 6: POA Configuration Options in Book 2: Post Office Agent Configuration and Chapter 4: MTA Configuration Options in Book 3: Message Transfer Agent Properties in the online Agent Configuration Guide.
Synchronization Problems. If information is correct in some places but incorrect in others, you can synchronize GroupWise information. See Chapter 4: Synchronize GroupWise Domains and Objects in Book 5: Maintain GroupWise Databases in the online Maintenance Guide.
Understand Message Flow
Problem: You are experiencing message delivery problems.
Action(s): Because each component (GroupWise client, MTA and POA) is responsible for a specific area of message flow, knowing where the message flow has been interrupted can help you discover which component is not functioning correctly. See Book 3: Message Flow Diagrams in the online Troubleshooting Guide.
Check File Ownership
Problem: The following information applies only if your network operating system is NetWare and you are using mapped drive connections to post offices. NetWare assigns ownership of a file to the creator of the file. Therefore, when a GroupWise program (client, agent, or gateway) creates a database file, the ownership of that database file is assigned according to the network ID currently being used.
For example, if Eric is using the GroupWise client program and it creates a new message database, Eric is assigned ownership of the message database. Problems can arise if a user who owns a database file is removed from the network. Problems can also arise if the owner has limited available disk space.
Action(s): If a database file cannot be accessed, you should check to see who owns the database and, if necessary, reassign ownership to a valid network account (preferably Admin or SUPERVISOR). To look up the locations of databases in domains and post offices
To check ownership, you can change to the directory and use the NDIR command. To reassign ownership, you can use the FILER utility.