
File Systems on Falcon


Distributed File Service

Falcon uses the Distributed Computing Environment (DCE) for user authentication. With DCE, all user information is centralized in the "registry", so each domain of Falcon does not require a separate copy of each user's login information ("/etc/passwd"). Integrated with DCE is the Distributed File Service (DFS). DFS is not yet available natively, but it is accessible from all nodes of Falcon through NFS mounts to CFS. Because of the NFS intermediation, users must take special steps to write to DFS.

The first time you log in, and at regular (~monthly) intervals afterwards, you will need to issue the following command to create credentials on the NFS/DFS gateway server. Issuing this command on one node is sufficient for all of Falcon. You should not have to issue this command every time you log in (unless you only log in once a month).

$ dfs_login
DCE password for user: password
...

Because the DFS servers are outside Falcon, DFS does not provide the highest performance for large files. For fast large-file access on Falcon, see CFS, below.

All user home directories are kept in DFS, and each user has a default storage limit of 500 MB. In addition to the "yesterday" backup described below, home directories in DFS are copied to tape backup four times a week.

Some typical Unix commands, like "df", do not work in DFS. To find your quota and usage in DFS, use the following command on a system that supports native DFS (Eagle or Bearcat). This command will not yet work on Falcon.

$ fts lsquota ~
Fileset Name          Quota    Used  % Used   Aggregate
us.user               500000  407726    81%    89% = 22865076/25574421 (LFS)
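
If you want to check this report automatically, the following is a minimal sketch. It assumes the output keeps the column layout shown above (fileset name, quota, used, percent used). The function name "quota_warn" and the default 90% threshold are illustrative and are not part of DFS.

```shell
# Hypothetical helper: read "fts lsquota" output on stdin and print a warning
# for every fileset whose usage is at or above a threshold (default 90%).
# Assumes the column layout shown above: name, quota, used, percent used.
quota_warn() {
    threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR == 1 { next }                 # skip the header line
        NF >= 4 {
            pct = $4
            sub(/%/, "", pct)            # "81%" -> "81"
            if (pct + 0 >= limit)
                printf "WARNING: %s is at %s%% of its quota\n", $1, pct
        }
    '
}

# Usage on a system with native DFS (not Falcon):
#   fts lsquota ~ | quota_warn 80
```

Lines with fewer than four fields are ignored, so a stray continuation line in the report does not produce a false warning.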

For more information on "fts" commands, see "man fts" on a system with native DFS (not Falcon). For "man" pages on "fts" subcommands, use an underscore ("_") between "fts" and the subcommand. Examples:

man fts
man fts_lsquota

Each home directory has a default set of subdirectories:

public: This directory is world readable. Use it to make files available to other users of CCS systems.

private: This directory is accessible only by the user. Because of the mechanism used to authenticate parallel processes, interactive or batch parallel jobs cannot be submitted from "private". If you want to run a parallel job in "private", you must submit the job from outside "private" and either use paths back into it or "cd" into it from a run script.

yesterday: This directory contains a read-only copy of the rest of the home directory, including other subdirectories, as of the day before. The copy is generated at 4AM each morning. If you accidentally delete any of your DFS files, you can copy the previous day's versions out of "yesterday".

bin: This directory is a location for user-generated executables. It is not in your "PATH" by default, however. You can add it or one of its subdirectories to your "PATH" in your ".profile" or ".cshrc" file.

www: In the future, we plan to make documents kept in this directory available over the World-Wide Web.
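
As an illustration of adding "bin" to your "PATH", here is a minimal sketch for a ".profile" read by sh or ksh. The function name "path_prepend" is hypothetical, not a system-provided command; it only avoids adding the same directory twice.

```shell
# Add a directory to PATH only if it exists and is not already listed.
# "path_prepend" is an illustrative name, not a system-provided command.
path_prepend() {
    dir="$1"
    [ -d "$dir" ] || return 0            # ignore missing directories
    case ":$PATH:" in
        *":$dir:"*) ;;                   # already present; leave PATH alone
        *) PATH="$dir:$PATH" ;;
    esac
    export PATH
}

path_prepend "$HOME/bin"
```

In a ".cshrc" file the syntax is different; a line such as "set path = ($HOME/bin $path)" has the same effect for csh.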


Cluster File System

The Cluster File System (CFS) is a large temporary-storage area with shared access from all Falcon nodes. The CFS servers are nodes of Falcon, and data transfer goes exclusively over the Quadrics interconnect.

CFS is intended as work space for Falcon applications. CFS is not backed up, so you need to copy any important output from CFS to one of the other file systems for permanent storage. CFS areas may be purged to help ensure that adequate work space is available for new jobs. Files that have not been accessed for more than a week are considered eligible for purging.
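
To see which of your files might be eligible for purging, you can search by access time. This is a minimal sketch using the standard "find" command; the function name "purge_candidates" is illustrative, and the actual purge procedure may use different criteria.

```shell
# List regular files under a directory that have not been accessed for more
# than a week, the eligibility rule stated above.
# "purge_candidates" is an illustrative name, not a Falcon command.
# Note: find counts whole 24-hour periods, so "-atime +7" matches files
# last accessed at least 8 days ago.
purge_candidates() {
    find "${1:-.}" -type f -atime +7 -print
}

# Usage: purge_candidates /path/to/your/cfs/directory
```

Copy anything important from the resulting list to permanent storage before it is purged.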

CFS storage is divided into various areas, and the name of each directory indicates the size of its area. For example, "/cfs500a" is about 500GB. The character after the size differentiates multiple areas of the same size.
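
The naming convention can be decoded mechanically. The sketch below assumes names of the form "/cfs" or "/pfs", then a size in GB, then a single lowercase letter; "area_info" is an illustrative name, not a Falcon command.

```shell
# Split a CFS or PFS area name such as "/cfs500a" into its parts.
# Prints: <type> <approximate size in GB> <suffix letter>
# Prints nothing if the name does not follow the convention.
area_info() {
    printf '%s\n' "$1" |
        sed -n 's|^/\([cp]fs\)\([0-9][0-9]*\)\([a-z]\)$|\1 \2 \3|p'
}

area_info /cfs500a    # prints: cfs 500 a
```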

The directory "/cfs500a" has a subdirectory for each user. We recommend that you use this subdirectory unless you have extreme storage requirements. Other CFS areas are reserved for applications with such requirements.

For your convenience, the environment variable "$SYSTEM_USERDIR" is defined so that it points to your CFS area. You can use it in jobs and interactive sessions for easy access to your CFS area. This same environment variable is defined appropriately on other systems to point to their equivalent node-global filesystem, whatever it may be, so you can use this environment variable to improve the portability of your scripts within CCS.
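
For example, a job script can build its working directory from "$SYSTEM_USERDIR" instead of a hard-coded path. This is a minimal sketch; the function name "job_workdir", the fallback argument, and the subdirectory name "run1" are illustrative assumptions.

```shell
# Pick a working directory for a job: use $SYSTEM_USERDIR when the system
# defines it, otherwise fall back to a caller-supplied directory.
# "job_workdir" is an illustrative name, not a Falcon command.
#   $1 = fallback base directory, $2 = subdirectory to create
job_workdir() {
    base="${SYSTEM_USERDIR:-$1}"
    if [ -z "$base" ]; then
        echo "SYSTEM_USERDIR is not set and no fallback was given" >&2
        return 1
    fi
    mkdir -p "$base/$2" && printf '%s\n' "$base/$2"
}

# Usage inside a job script:
#   work=$(job_workdir "$HOME/work" run1) && cd "$work"
```

Because the script never names "/cfs500a" directly, it can run unchanged on other CCS systems that define the same variable.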

Falcon is split into three CFS domains. The first few nodes are the fileserver domain and serve out the "/cfs*" files. The remaining domains represent the compute partition of Falcon; these nodes access the "/cfs*" directories in the fileserver domain through the SC File System (SCFS). SCFS just provides high-performance clients for the CFS servers; it is not a file system in and of itself.


Parallel File System

Like CFS, the Parallel File System (PFS) is a temporary-storage area with shared access from all Falcon nodes. PFS is striped over multiple CFS component systems, so the potential bandwidth to disk is greater. Therefore, PFS can provide higher performance for large, contiguous file transfers.

PFS should be considered an experimental file system. It is not yet as stable as CFS.

PFS areas are named like CFS areas; "/pfs150a" is roughly 150GB.


High-Performance Storage System

The High-Performance Storage System (HPSS) provides archival storage. It is "high performance" relative to other archival systems, not relative to native file systems like CFS. Large permanent files should be moved directly from CFS or PFS, presumably where they were created, to HPSS.

You access HPSS through the "hsi" interface, which is available on all Falcon nodes. Because it uses DCE authentication, "hsi" requires no password and can thus be used within batch scripts.
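
As a sketch of batch use, the function below archives one file with "hsi". The function name "archive_file" and the "HSI_DRYRUN" variable are illustrative assumptions; the "put local : remote" form is the usual "hsi" syntax, but confirm it with "hsi help" on Falcon before relying on it.

```shell
# Archive one file to HPSS from a batch script.
# "archive_file" and HSI_DRYRUN are illustrative names, not part of hsi.
#   $1 = local file, $2 = HPSS path (defaults to the local file's base name)
archive_file() {
    local_file="$1"
    hpss_file="${2:-$(basename "$1")}"
    if [ -n "${HSI_DRYRUN:-}" ]; then
        # Dry run: show the command without contacting HPSS.
        printf 'hsi put %s : %s\n' "$local_file" "$hpss_file"
    else
        hsi "put $local_file : $hpss_file"
    fi
}

# Usage in a batch script, after the application has written its output:
#   archive_file "$SYSTEM_USERDIR/run1/output.dat" run1/output.dat
```

Setting "HSI_DRYRUN" to any non-empty value lets you check the generated command before submitting the job.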

HPSS is unavailable during weekly maintenance, which currently occurs Wednesday mornings, typically 7AM-10AM.

For more information on HPSS and "hsi", type "hsi help" on Falcon or see the online documentation kept at SDSC, available at the following URL.

http://www.sdsc.edu/Storage/hsi/

What about "/tmp"?

The "/cfs*" directories are shared across Falcon and are usually the right choice for I/O in parallel jobs. "/scratch" is scratch space local to each node, for use by running jobs. "/tmp" is also local to each node, but it is very limited and reserved for system administration, so we strongly suggest that you do not use it. Use "/scratch" if you want local disk, and use "/cfs*" (or the less stable "/pfs*") if you want disk available to all processes in a parallel job.
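
The guidance above can be captured in a small helper so that scripts never default to "/tmp". This is a sketch only: "io_dir" is an illustrative name, and the fallback "/cfs500a/$USER" assumes that your CFS subdirectory is named after your user name.

```shell
# Choose a directory for job files:
#   "local"  -> /scratch (node-local disk)
#   "shared" -> your CFS area (visible to all processes in a parallel job)
# "io_dir" is an illustrative name. It never returns /tmp.
io_dir() {
    case "$1" in
        local)  printf '%s\n' "/scratch" ;;
        # Assumption: the CFS subdirectory is named after the user.
        shared) printf '%s\n' "${SYSTEM_USERDIR:-/cfs500a/$USER}" ;;
        *)      echo "usage: io_dir local|shared" >&2; return 1 ;;
    esac
}

# Usage: cd "$(io_dir shared)"
```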



URL http://www.ccs.ornl.gov/falcon/filesystems.html
Updated: Monday, 28-Oct-2002 09:59:56 EST
consult@ccs.ornl.gov