Hitachi ID Systems, Inc.

Best Practices for Managing Privileged Passwords


Introduction

This document describes the business problems that privileged password management systems are intended to address. It then describes best practices for defining and enforcing policies for discovering systems on which to manage passwords, managing passwords for privileged accounts, securely storing privileged passwords, disclosing passwords, activating privileged login sessions and more.

Privileged accounts include administrator accounts, embedded accounts used by one system to connect to another and accounts used to run service programs.

Hitachi ID Privileged Password Manager is a security product that enables organizations to define and enforce policies regarding access to privileged accounts on a variety of systems. It enables organizations to control which users and programs have access to which privileged accounts, when such access is allowed and how access is activated and deactivated. Access rights may be permanent or temporary. All access is logged and subject to audit reports.

The bulk of this document is general in nature -- applying equally to any privileged password management system. A few sections are specific to Privileged Password Manager and are identified as such.

Look for the figure marks throughout this document to find best practices.


Risk management

Baseline risks

Organizations face significant security exposure in the course of routine IT operations. For example, dozens of system administrators may share passwords for privileged accounts on thousands of devices. When system administrators move on, the passwords they used during their work often remain unchanged, leaving organizations vulnerable to attack by former employees and contractors.

Other security problems related to administrator accounts include:

Some organizations manage the most secure passwords by periodically changing them, writing them down and literally storing them in a safe. This approach sounds secure but it creates its own set of business risks:

In addition to privileged accounts used by IT system administrators, there are also privileged accounts used by one application to connect to another. For example, many web applications use a login ID and password to connect to databases, directories or web services. These accounts may have their own security risks:

Finally, unattended processes on Windows systems also run with a login ID and password. This includes service accounts, scheduled tasks, anonymous access to web content and more. Many applications only work when these services have elevated privileges. This also creates business risk:

In each of the above cases, the risks can be summarized as:

Securing privileged passwords

A privileged password management system works to mitigate the baseline risks identified in the previous section:

New risks once automation manages privileged passwords

Once a privileged password management system is deployed, baseline risks are addressed but new risks must be considered:

Protecting the privileged password management system


Server infrastructure

This section describes how a privileged password management system should be configured, to support the business objectives of high scalability and high availability.

Server numbers and placement

Fundamentally, a privileged password management system should always be deployed with at least two servers, preferably located in two different physical sites. This arrangement prevents system failure due to component failure:

Multiple servers should carry on near-real-time data replication. As the number of servers grows, so too will the volume of replication traffic between them. Since servers should be at different physical sites, the database replication traffic will be carried over a wide area network. The net result is that while having at least two servers is essential, having too many servers will significantly reduce overall system performance. Consequently, it is recommended that organizations deploy no more than three replicated privileged password management servers.

The next question is where the servers should be placed on the network. While it is impossible to answer this question in the general sense -- no two network topologies are alike -- it is possible to offer some general guidance:

Database type and placement

Whereas the preceding guidance applies to any privileged password management system, the following guidance is specific to Privileged Password Manager:

Use of firewalls

A privileged password management system contains and controls disclosure of very sensitive information. This naturally raises the question of whether and how firewalls could be leveraged to increase the security of the system itself.

It is reasonable to place firewalls between end users and the privileged password management system. Assuming that the user interface runs over HTTPS on TCP port 443, it is straightforward to limit inbound connections from users to just port 443.

Moreover, an application-level firewall (configured as a reverse web proxy) could:

A firewall between a privileged password management system and devices on which passwords are being managed can also be used. Since every type of target system may use a different protocol, the configuration of a firewall in this location could allow any connection initiated by the privileged password management system but block any connection initiated in the other direction.

Finally, a firewall may be considered between the privileged password management system and its internal database. This is actually not recommended:

An optimal firewall configuration to protect multiple PPM servers is illustrated in Figure 4.

Figure 4: Protecting PPM servers with firewalls

Virtual vs. physical servers

The following guidance is specific to Privileged Password Manager:

Impact on network and storage

Whenever Privileged Password Manager randomizes a password, the resulting database records (current password value, password history, log events, etc.) consume 5.1 kBytes each on Microsoft SQL Server and 3.1 kBytes each on Oracle Database.

Using the round number of 6 kBytes per password change, an organization that wishes to manage 5,000 privileged passwords, change them daily and retain archival password data for 3 years will require a database with:

Assuming some overhead for workflow requests, system configuration, etc. (i.e., double the space requirement to be safe), it is reasonable to configure the above system with about 60 GBytes of disk, regardless of which database type is used.
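The sizing arithmetic above can be reproduced with a short calculation. The following is a minimal sketch that uses only the figures quoted in the text (6 kBytes per change, 5,000 passwords, daily changes, 3 years of retention, a doubling for overhead); the function and constant names are illustrative, and decimal units (1 GByte = 1,000,000 kBytes) are an assumption.

```python
# Back-of-the-envelope database sizing, using the figures quoted in the text.
KBYTES_PER_CHANGE = 6      # rounded per-change footprint
PASSWORDS = 5_000          # managed privileged passwords
CHANGES_PER_DAY = 1        # each password is changed daily
RETENTION_YEARS = 3        # archival retention period
OVERHEAD_FACTOR = 2        # "double the space requirement to be safe"


def storage_gbytes(passwords, changes_per_day, years, kbytes_per_change):
    """Return the raw password-history storage in GBytes (decimal units, assumed)."""
    total_changes = passwords * changes_per_day * 365 * years
    return total_changes * kbytes_per_change / 1_000_000


raw = storage_gbytes(PASSWORDS, CHANGES_PER_DAY, RETENTION_YEARS, KBYTES_PER_CHANGE)
with_overhead = raw * OVERHEAD_FACTOR
print(f"raw: {raw:.2f} GBytes, with overhead: {with_overhead:.1f} GBytes")
```

The raw figure works out to roughly 33 GBytes, and doubling it lands in the same range as the 60 GByte recommendation. Changing the constants gives a first estimate for a different number of passwords or a different retention period.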

It should be noted that the Express Editions of Microsoft SQL Server and Oracle Database are both limited to 2 GByte databases, which underscores the need to deploy with a standard or enterprise edition of either database in a production deployment, rather than the Express Edition (which is only suitable for test, QA, etc.).

Similarly, the amount of traffic that Privileged Password Manager transmits between servers to replicate the storage of changed passwords is just under 1 kByte/password. Using the same example organization, we can estimate about 5 MByte/day of replication traffic associated with password updates, though more will probably be needed to replicate workflow requests, login audit records, access disclosure, etc., adding up to about 100 MByte/day of total replication traffic.
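To put the replication figures in perspective, the daily volumes can be converted into an average line rate. This is a rough sketch that takes the 1 kByte/password and 100 MByte/day figures from the text at face value, assumes decimal units, and assumes traffic is spread evenly over the day (real traffic will be burstier); the names are illustrative.

```python
# Convert the daily replication volumes quoted in the text into average rates.
KBYTES_PER_REPLICATED_CHANGE = 1   # "just under 1 kByte/password", rounded up
PASSWORD_CHANGES_PER_DAY = 5_000   # example organization: 5,000 daily changes
TOTAL_MBYTES_PER_DAY = 100         # estimate including workflow, audit records, etc.
SECONDS_PER_DAY = 24 * 60 * 60


def password_replication_mbytes_per_day(changes, kbytes_per_change):
    """Daily replication volume for password updates alone, in MBytes."""
    return changes * kbytes_per_change / 1000


def average_kbits_per_second(mbytes_per_day):
    """Average line rate if the daily volume were spread evenly over 24 hours."""
    return mbytes_per_day * 1000 * 8 / SECONDS_PER_DAY


password_only = password_replication_mbytes_per_day(
    PASSWORD_CHANGES_PER_DAY, KBYTES_PER_REPLICATED_CHANGE
)
average_rate = average_kbits_per_second(TOTAL_MBYTES_PER_DAY)
print(f"password updates: {password_only:.0f} MByte/day")
print(f"total replication: about {average_rate:.1f} kbit/s on average")
```

Even the 100 MByte/day total averages out to under 10 kbit/s, which suggests that replication volume for an organization of this size is unlikely to strain a wide area link.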

The above bandwidth is strictly to handle replication between servers. Additional bandwidth is consumed between Privileged Password Manager and target systems (to randomize passwords) and between end users and Privileged Password Manager (to request and acquire privileged access):

Total network impact can be estimated based on the rough metrics above, multiplied by the workload projected for the system.

Server sizing

The preceding discussion is helpful when deciding where to place Privileged Password Manager servers and databases. This leaves open the question of how to configure the hardware for each server. Following is a reasonable configuration, which attempts to balance performance with cost given component costs as of January, 2010:

A single server configured like this can reasonably manage at least 100,000 passwords on target systems every 24 hours while concurrently servicing at least 100 concurrent, interactive user sessions. User sessions into the Privileged Password Manager web UI typically only last about 2 minutes, so during an 8-hour work day this server could reasonably disclose privileged access 24,000 times.
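The capacity figures in the preceding paragraph follow from simple arithmetic, sketched below. The inputs (100 concurrent sessions, 2-minute sessions, an 8-hour work day, 100,000 password changes per 24 hours) come from the text; the function names are illustrative.

```python
# Reproduce the throughput estimates from the stated session and workload figures.
def disclosures_per_workday(concurrent_sessions, session_minutes, workday_hours):
    """Sessions served in a work day if every concurrent slot is always busy."""
    sessions_per_slot = (workday_hours * 60) // session_minutes
    return concurrent_sessions * sessions_per_slot


def password_changes_per_second(passwords_per_day):
    """Average password change rate needed to cover the daily workload."""
    return passwords_per_day / (24 * 60 * 60)


daily_disclosures = disclosures_per_workday(100, 2, 8)
change_rate = password_changes_per_second(100_000)
print(daily_disclosures)        # 24000
print(round(change_rate, 2))    # 1.16
```

The second figure shows that managing 100,000 passwords daily requires only slightly more than one password change per second on average.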

In any deployment, at least two such servers should be deployed, with each server housing a complete database replica and the Privileged Password Manager software. The servers should be installed at different sites.

A virtual machine should be configured with similar disk, I/O, CPU and memory capacity.

Load balancing and replication

Given that at least two privileged password management servers are deployed and assuming that both servers are active at all times, the next question is how to configure load balancing so that users can access both servers.

Load balancing can be accomplished using a variety of mechanisms, including:

Figure 5: Load balancing using multiple IPs on a single DNS name

Figure 6: Load balancing with a reverse web proxy

Figure 7: Load balancing by routing to different IP addresses

Each of these techniques will work. DNS may be preferable since it requires no special infrastructure and -- depending on the type of DNS server software used -- may be configurable so that the server IP address returned from each DNS query is chosen to be the server closest to the requesting user.

This is best illustrated with an example:

Regardless of the load balancing technology used, sessions from a given client to a given PPM server should be "sticky" in the sense that the same PPM server will be used throughout the session. This is important because it eliminates the need for multiple PPM servers to replicate session state data, which significantly lowers the bandwidth required.
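One common way for a reverse proxy or routing layer to keep sessions sticky is to derive the server choice from the client address, so that the same client always lands on the same server. The sketch below illustrates that idea only; it is not taken from the document or from any particular load balancer, and the server names are hypothetical.

```python
import hashlib

# Hypothetical server names, for illustration only.
SERVERS = ["ppm1.example.com", "ppm2.example.com"]


def pick_server(client_ip, servers=SERVERS):
    """Map a client IP address to one server, deterministically.

    The same client IP always hashes to the same index, so every request
    in a session reaches the same server and no session state needs to
    be replicated between servers.
    """
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


first = pick_server("10.1.2.3")
second = pick_server("10.1.2.3")
print(first == second)   # True: repeated requests reach the same server
```

A drawback of this simple scheme is that adding or removing a server changes the mapping for many clients, so sessions that are in progress at that moment may be interrupted.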


When and how to randomize privileged passwords

The entire premise of a privileged password management system is to secure privileged passwords by periodically scrambling them, so that current password values are not known to users or programs until and unless they are actually needed and that disclosure is authorized and logged.

This premise raises an obvious question: when should passwords be randomized and how should the random passwords be composed?
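As an illustration of how a random password might be composed, the sketch below draws characters from a cryptographically secure source and guarantees at least one character from each class. The length of 16, the character classes and the symbol set are assumptions made for this example, not policy values from this document; a real policy must also respect the limits of each target system.

```python
import secrets
import string

# Assumed character classes; adjust to what each target system accepts.
CLASSES = [
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    "!@#$%^&*-_=+",
]


def random_password(length=16):
    """Return a random password containing at least one character per class."""
    if length < len(CLASSES):
        raise ValueError("length is too short to include every character class")
    # One guaranteed character from each class...
    chars = [secrets.choice(cls) for cls in CLASSES]
    # ...then fill the remainder from the combined alphabet.
    alphabet = "".join(CLASSES)
    chars += [secrets.choice(alphabet) for _ in range(length - len(chars))]
    # Shuffle so the guaranteed characters are not in predictable positions.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


print(random_password())
```

The `secrets` module is used rather than `random` because password values must not be predictable from earlier outputs.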


Securing access disclosure

Identification, authentication and authorization

A privileged password management system's job is not only to randomize and store privileged passwords but also to disclose those passwords to users, applications and service-running infrastructure. Otherwise, the privileged accounts whose passwords are being secured would become useless.

Disclosure must be controlled:

Following are some best practices for identification, authentication, authorization and audit:

Access disclosure mechanisms

A privileged password management system is normally configured to control access by people and programs to privileged accounts. The previous section covered authentication and authorization of users who wish to gain this access. This still leaves open the question of how disclosure is actually accomplished.

There are several approaches to disclosing access:

Concurrent disclosure (checkin/checkout)

Since a privileged password management system is able to control disclosure of access to privileged accounts, it is also in a position to control how many people can gain access to the same privileged account at the same time. This is useful for two reasons:

With this in mind, it's reasonable to promote some best practices:

With concurrency controls in place, a risk arises that one administrator will check out access to a privileged account, leave the session active and stop working (go home, leave for lunch, etc.). If another administrator needs access to the same system during the time interval when the first administrator's session is still active but the first administrator has left, then the system will be inaccessible. To mitigate this risk, it is important to set time limits on administrative sessions -- for example, a 1 hour default and a 4 hour maximum. This reduces the time window during which a system is unmanageable because of an unused but still open session.
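The interaction between exclusive checkout and time limits can be sketched as a small lease registry. The 1-hour default and 4-hour maximum are the example values from the paragraph above; the class and method names are hypothetical and do not describe any product's API.

```python
from datetime import datetime, timedelta

DEFAULT_DURATION = timedelta(hours=1)   # example default from the text
MAX_DURATION = timedelta(hours=4)       # example maximum from the text


class CheckoutRegistry:
    """Tracks one exclusive, time-limited checkout per privileged account."""

    def __init__(self):
        self._leases = {}   # account -> (user, expires_at)

    def checkout(self, account, user, now, duration=DEFAULT_DURATION):
        """Grant access unless another unexpired checkout exists."""
        duration = min(duration, MAX_DURATION)   # never exceed the maximum
        lease = self._leases.get(account)
        if lease is not None and lease[1] > now:
            raise RuntimeError(f"{account} is checked out by {lease[0]}")
        expires_at = now + duration
        self._leases[account] = (user, expires_at)
        return expires_at

    def checkin(self, account, user):
        """Release the account early if the caller holds the checkout."""
        lease = self._leases.get(account)
        if lease is not None and lease[0] == user:
            del self._leases[account]


registry = CheckoutRegistry()
start = datetime(2010, 1, 4, 9, 0)
registry.checkout("root@db1", "alice", start)
# Alice leaves without checking in; Bob can still get in once the lease expires.
registry.checkout("root@db1", "bob", start + timedelta(minutes=61))
```

Because an expired lease no longer blocks new checkouts, an abandoned session makes the account unavailable for at most the configured duration.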

A second consideration related to concurrency controls is how to enforce them when a password was actually displayed to the user who gained access to a privileged account. The administrator in question will still have the password, even after the password checkout time interval has elapsed. To reliably end the administrator's session, it is important to, if possible:


Reporting on access disclosure

A fundamental capability of a privileged password management system is to create accountability for administrators who used shared, privileged accounts. This is done by (a) logging all access disclosure and (b) reporting on this disclosure.

Reports on privileged sessions should be run in two ways:

An important question is who should be allowed to run reports on access disclosure. Since the reports only indicate who had access to what, but not what they did, it seems reasonable to have a default policy that allows any IT user to report on the activity of any other. This "transparent" model encourages good behaviour since administrative sessions are "public knowledge" among IT staff. A transparent policy also supports troubleshooting: an administrator who sees a configuration problem on a system can quickly determine who may have made the change in question and ask them why.

The only real exception to the transparent approach to reporting is if a small team of administrators needs to make changes that are so sensitive that other administrators should not know about them. For example, if mass layoffs are being planned, including layoffs of other administrators, it makes sense to keep this secret. Since this sort of scenario is quite rare, it still seems reasonable for most organizations to have transparent administration practices by default, and only change the policy under very unusual circumstances.

Another consideration is how long to retain records of access requests, privileged access sessions, reports that were run, etc. Since disk space is relatively inexpensive, it seems reasonable to archive at least several years' worth of data on-line.

Finally, in addition to allowing IT users to run reports (and see one another's activity), IT security auditors and corporate risk officers should be empowered to run the same reports -- they should be able to see what IT staff are doing, without being able to gain access to systems themselves. In other words, the right to run reports should not be connected to the right to gain access to privileged accounts.


System monitoring and maintenance

Allocating staff to monitor and maintain the system

Between 1/4 and 1 full time equivalent position is required to effectively manage a production privileged password management system. The responsibilities of ongoing system management can be roughly broken down into two roles -- a project coordinator and a technical system administrator.

The responsibilities of the long-term privileged password management system project coordinator include:

The project coordinator's skills are basically competent IT project management.

The responsibilities of the privileged password management system's technical administrator include:

The technical system administrator's skills may include any of:

Monitoring system health

A production privileged password management system should be monitored, to ensure that it is operating correctly at all times.

This includes:

Platform monitoring is most effectively handled using a standard IT infrastructure monitoring system, such as HP OpenView or Microsoft Operations Manager.

Application monitoring is most effectively handled by configuring the privileged password management system itself to send e-mails or open support incidents when events of interest happen.

Security monitoring is most effectively handled by periodically running reports against log data and sending those reports to security officers.


Configuring target system integrations

A privileged password management system's value increases as the number of integrations grows. The security benefit is clearly greater if privileged passwords are secured on 1000 systems, as compared to 100.

As the number of integrated systems grows, the cost of adding, maintaining and removing integrations manually will also grow. Automation is needed to scale the system up to more than a few hundred integrations.

Automating the maintenance of integrations means automating several, distinct tasks:

In any medium-to-large organization, workstations and servers are activated and retired daily. It therefore seems reasonable to run any auto-discovery process every 24 hours.

There are several technological approaches to discovering servers. Choice of the appropriate method depends on available infrastructure:

Once a system has been discovered using one of the mechanisms described above, the next step is to -- initially and periodically -- get a list of login accounts from that system and determine which of them qualify as "privileged" -- because they are members of administrator-level groups, have a given numeric ID, are used to run services or scheduled tasks, etc.

The mechanism for enumerating IDs and qualifying them as privileged varies from system to system. For example, an SSH script that checks whether a given user has a UID of 0 or belongs to groups such as wheel, root or admin can be used on Unix or Linux systems, while a program that connects over RPC and checks group membership, Windows Service Manager configuration, Scheduler configuration and IIS configuration is appropriate for Windows systems.
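For the Unix and Linux case, the qualification rule described above (UID of 0, or membership in a group such as wheel, root or admin) can be sketched as follows. The example assumes that the contents of `/etc/passwd` and `/etc/group` have already been retrieved as text, for example over SSH; the retrieval step, the function name and the sample accounts are illustrative only.

```python
# Groups named in the text as indicating administrator-level access.
ADMIN_GROUPS = {"wheel", "root", "admin"}


def privileged_accounts(passwd_text, group_text):
    """Return the sorted login names that qualify as privileged."""
    admin_gids = set()
    admin_members = set()
    for line in group_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # /etc/group format: name:password:gid:member1,member2,...
        name, _password, gid, members = line.split(":", 3)
        if name in ADMIN_GROUPS:
            admin_gids.add(int(gid))
            admin_members.update(m for m in members.split(",") if m)

    privileged = set()
    for line in passwd_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # /etc/passwd format: name:password:uid:gid:comment:home:shell
        fields = line.split(":")
        name, uid, gid = fields[0], int(fields[2]), int(fields[3])
        if uid == 0 or gid in admin_gids or name in admin_members:
            privileged.add(name)
    return sorted(privileged)


SAMPLE_PASSWD = """root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000::/home/alice:/bin/bash
bob:x:1001:1001::/home/bob:/bin/bash
toor:x:0:0::/root:/bin/sh"""

SAMPLE_GROUP = """root:x:0:
wheel:x:10:alice
users:x:100:alice,bob"""

print(privileged_accounts(SAMPLE_PASSWD, SAMPLE_GROUP))
# ['alice', 'root', 'toor']
```

Checking the UID as well as the login name matters: in the sample data, `toor` is a second account with UID 0 and would be missed by a check that only looked for the name `root`.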

The frequency of enumerating privileged IDs on discovered target systems should be high, in order to detect IDs that were created in an unauthorized fashion. On the other hand, it should be low, in order to reduce the run-time of the auto-discovery process and to minimize network impact. In practice, a frequency of between once daily and once weekly is a reasonable compromise between these conflicting objectives.

Another question that arises when auto-discovering target systems is what credentials can/should be used when first connecting to each system. Reasonable options include:

Since a privileged password management system attempts to connect to integrated systems often -- to change passwords -- it can be used as a coarse-grained infrastructure monitoring facility, to raise alarms in the event that a target system is unreachable. Alerts can take the form of e-mails to administrators, tickets in a help desk system, etc.

If a target system is persistently unavailable, it should be automatically removed from the regular password rotation process. This helps keep the database clean as systems are moved or retired. It should be noted that historical password data for every system should be retained -- in the event that a system was offline for an extended period due to hardware problems or that it is later restored from backup media.

In any case, a report should be run regularly to identify non-responsive target systems. This will allow system administrators to match the list of non-responsive systems against a list of known-retired systems and to identify anomalies where the lists don't match.
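The matching step described above amounts to comparing two sets of system names. A minimal sketch follows, assuming both lists are available as plain host names; the host names, function name and result keys are invented for illustration.

```python
def reconcile(non_responsive, known_retired):
    """Compare non-responsive systems against the list of retired systems."""
    non_responsive = set(non_responsive)
    known_retired = set(known_retired)
    return {
        # Offline and recorded as retired: nothing to investigate.
        "retired_as_expected": sorted(non_responsive & known_retired),
        # Offline but not recorded as retired: possible outage or missed record.
        "unexplained_outages": sorted(non_responsive - known_retired),
        # Recorded as retired but not flagged as non-responsive.
        "retired_but_not_flagged": sorted(known_retired - non_responsive),
    }


report = reconcile(
    non_responsive=["db07", "web12", "app03"],
    known_retired=["db07", "web12", "mail01"],
)
print(report["unexplained_outages"])       # ['app03']
print(report["retired_but_not_flagged"])   # ['mail01']
```

The last two categories are the anomalies worth reviewing: a system that is down without explanation, and a system that is supposedly retired but was not reported as unreachable.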


Production migration

A privileged password management system is a very sensitive part of an organization's infrastructure. As such, it should first be deployed to a test environment and its configuration validated, before moving to production.

Once in production, it makes sense to phase use of the system in, to minimize risk due to configuration problems, software defects, etc.:


API considerations

An API allows a privileged password management system to secure passwords that authenticate one application when it connects to another. For example, an e-commerce application may have to sign into a database server to read inventory data and post transactions. Such connections are normally authenticated using a login ID and password.

The security problem when one application uses a password to authenticate to another is that the password may be:

To eliminate this problem, a privileged password management system may be used to periodically scramble the embedded account's password. An API then allows each instance of the e-commerce application to fetch a current password value, with which it can connect to the database, from the privileged password management system. This eliminates static passwords and passwords in plaintext files but creates new challenges to overcome:

There are no "one size fits all" answers to these questions. Every organization will have its own priorities and every application will have its own constraints, leading to somewhat different answers. Following are some reasonable approaches to each of the above questions, presented with the understanding that they may or may not suit the needs of a given organization and application.


Summary

A privileged password management system enables organizations to replace well-known, static and insecure passwords with frequent password changes, strong and personal authentication, fine-grained authorization logic and extensive audit logs.

Deploying this sort of system can be invasive -- failure of the system, in terms of confidentiality, integrity or availability, would be catastrophic. Consequently, great care must be taken to deploy the system in a manner that is robust, fault-tolerant and secure.

This document outlined an exhaustive set of best practices intended to ensure that a privileged password management system is highly available, secure, scalable and efficient to manage.