When Steve Bourne was writing his Unix shell, which came to be known as the Bourne shell, he made a directory of files with one-character names, one for each byte value except NUL ('\0') and slash, the two characters that cannot appear in Unix file names. He used that directory for all manner of tests of pattern-matching and tokenization, and it routinely wrought havoc on unwary programs such as backup programs.
The test directory was of course created by a program. For years afterwards, that directory was the bane of file-tree-walking programs; it tested them to destruction. Note that the directory must also have contained the entries . and .., as every Unix directory does. This doesn't affect the effectiveness of the anecdote, or the careful testing it describes.
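A program that builds such a test directory could look roughly like the following Python sketch. It is not Bourne's original program; the function name make_byte_name_directory is made up for illustration, and the sketch assumes a Unix-like file system that accepts arbitrary bytes in names.

```python
import os
import tempfile


def make_byte_name_directory(root: str) -> list[bytes]:
    """Create one empty file per single-byte name; return the names created."""
    root_b = os.fsencode(root)
    created = []
    for value in range(1, 256):      # skip 0: NUL cannot appear in a name
        name = bytes([value])
        if name == b"/":             # skip slash: it is the path separator
            continue
        path = os.path.join(root_b, name)
        try:
            # O_EXCL makes creation fail instead of reusing an existing entry.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except OSError:
            # "." already exists as a directory entry, and some file systems
            # reject names that are not valid in their own encoding.
            continue
        os.close(fd)
        created.append(name)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        names = make_byte_name_directory(tmp)
        print(len(names), "files created")
```

Pointing a file-tree walker or a backup script at the resulting directory is a quick way to find out whether it copes with newlines, control characters and leading dashes in names.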
Instead of creating a blacklist of characters, you could use a whitelist. All things considered, the range of characters that make sense in a file or directory name is quite small, and unless you have very specific naming requirements, your users will not hold it against your application if they cannot use the whole ASCII table.
A whitelist does not solve the problem of reserved names in the target file system, but it makes it easier to mitigate the risks at the source. Start from a small set of characters known to be safe, plus any additional safe characters you wish to allow. Beyond this, you just have to enforce some additional rules regarding spaces and dots. This is usually sufficient, and it already allows quite complex and nonsensical names.
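As a minimal sketch of the whitelist idea in Python: the exact character set and the dot and space rules below are assumptions chosen for illustration (ASCII letters, digits, underscore, hyphen, space and dot, with the first and last character restricted to letters, digits and underscore), not the list from the original answer.

```python
import re

# Assumed whitelist: ASCII letters, digits, underscore, hyphen, space and dot.
# The first and last characters must be a letter, digit or underscore, which
# rules out leading or trailing spaces and dots as well as a leading hyphen.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_](?:[A-Za-z0-9_. -]*[A-Za-z0-9_])?")


def is_safe_name(name: str) -> bool:
    """Return True if the whole name consists of whitelisted characters."""
    return _SAFE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("report 2024.final.txt", ".hidden", "name.", "a/b", "a..b"):
        print(repr(candidate), is_safe_name(candidate))
```

A name such as a..b still passes, which illustrates the remark about nonsensical names; reserved names of the target file system are not checked here.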
In one of my applications, I used the same rules as above but stripped any duplicate dots and spaces.
Well, if only for research purposes, then your best bet is to look at the Wikipedia entry on filenames. If you want to write a portable function to validate user input and create filenames based on that, the short answer is: don't. Take a look at a portable module like Perl's File::Spec to get a glimpse of all the hoops needed to accomplish such a 'simple' task. Windows will pop up a message box telling you the list of illegal characters.
The best suggestion I could come up with was to let the user name the file however they like. Using an error handler when the application tries to save the file, catch any exceptions, assume the filename is to blame (obviously after making sure the save path was OK as well), and prompt the user for a new file name.
For best results, place this checking procedure within a loop that continues until either the user gets it right or gives up. This worked best for me, at least in VBA.
In Windows 10, a set of characters is forbidden, and an error is shown when you try to type any of them. When Windows creates an internet shortcut, it builds the file name by skipping illegal characters, except for the forward slash, which is converted to a minus sign.
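The save-and-retry approach described above (try to save, catch the error, ask again) might look like this in Python rather than VBA. The names save_with_retry and ask_name and the max_attempts limit are hypothetical; ask_name stands in for whatever dialog the application uses and returns None or an empty string when the user gives up.

```python
import os
from typing import Callable, Optional


def save_with_retry(directory: str, data: bytes,
                    ask_name: Callable[[], Optional[str]],
                    max_attempts: int = 5) -> Optional[str]:
    """Ask for a name and try to save; repeat until success or the user gives up."""
    if not os.path.isdir(directory):
        raise NotADirectoryError(directory)   # rule out a bad save path first
    for _ in range(max_attempts):
        name = ask_name()
        if not name:                          # None or "" means the user gave up
            return None
        if os.path.basename(name) != name:    # keep the file inside `directory`
            print(f"{name!r} contains a path separator, please try again")
            continue
        path = os.path.join(directory, name)
        try:
            with open(path, "xb") as handle:  # "x": never overwrite an existing file
                handle.write(data)
        except (OSError, ValueError) as error:
            # OSError: the file system rejected the name (or the file exists);
            # ValueError: the name contains an embedded NUL character.
            print(f"Could not save as {name!r}: {error}")
            continue
        return path
    return None
```

The only up-front check is the one for path separators, so that a name cannot escape the chosen directory; everything else is left to the file system, which is the point of the approach.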
In Unix shells, you can quote almost every character in single quotes ('). The exception is the single quote itself, and you can't express control characters, because \ is not expanded. Accessing the single quote itself from within a quoted string is possible, because you can concatenate strings with single and double quotes, like 'I'"'"'m', which can be used to access a file called I'm (a double quote is also possible this way).
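When a program has to build a shell command around such a name, the same concatenation trick is available in Python's standard library as shlex.quote. A small sketch (the file name is only an example):

```python
import shlex

name = "I'm"
quoted = shlex.quote(name)   # wraps the name in single quotes, splicing in "'"
print(quoted)

# Round trip: parsing the quoted word with POSIX shell rules gives the name back.
assert shlex.split(quoted) == [name]
```

shlex.quote targets POSIX shells only; it is not suitable for cmd.exe or PowerShell.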
So you should avoid all control characters, because they are too difficult to enter in the shell. The rest is still fun, especially files starting with a dash, because most commands read those as options unless you put two dashes (--) before them or specify them with a ./ prefix. If you want to be nice, don't use any of the characters that the shell and typical commands use as syntactical elements, some of which are position dependent.
When you are mean, your file names are VT escape sequences ;-), so that an ls garbles the output.
I had the same need, was looking for recommendations or standard references, and came across this thread. I keep a blacklist of characters that should be avoided in file and directory names.
Slash is used as a path name component separator in Unix-like, Windows, and Amiga systems.
Allowed in Unix filenames, see Note 1.
Colon is also used in Windows to separate an alternative data stream from the main file.
Period: in other OSes, usually considered as part of the filename, and more than one period (full stop) may be allowed. In Unix, a leading period means the file or folder is normally hidden.
Space: allowed, but the space is also used as a parameter separator in command line applications. This can be solved by quoting the entire filename.
Maximum 9 character base name limit for sequential files (without extension), or maximum 6 character base name and 3 character extension for binary files; see 6.3.
Maximum 8 character base name limit and 3 character extension; see 8.3.
Paths can be up to 32, characters.
Single-level directory structure with disk letters A-Z.
Maximum of 8 character file name with maximum 8 character file type, separated by whitespace. Later versions of VM introduced hierarchical filesystem structures, SFS and BFS, but the original flat directory 'minidisk' structure is still widely used.
Flat filesystem with no subdirs.
A full 'file specification' includes device, filename and extension (file type) in the format: dev:filnam.
Disks and tape drives are addressed either using a label (up to 8 characters) or a unit specification.
The HP file system does not use directories, nor does it use extensions to indicate file type.
Well, Long Path Tool can help in this situation. It can also be deleted from Linux, if you have access to a Linux system. LongPathTool is payware, unfortunately, and I am not paying for something I only need once every 2 years.
Use any character in the current code page for a name, including Unicode characters and characters in the extended character set, except for the following:
Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams.
Use a period as a directory component in a path to represent the current directory. For more information, see Paths.
Use two consecutive periods (..) as a directory component in a path to represent the parent of the current directory.
Also avoid reserved device names such as NUL, even when they are followed immediately by an extension. For more information, see Namespaces.
Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface do not. However, it is acceptable to specify a period as the first character of a name.
When you create a long file name, Windows may also create a short 8.3 form of the name. Not all file systems follow the tilde substitution convention, and systems can be configured to disable 8.3 alias generation. Therefore, do not make the assumption that the 8.3 alias already exists on disk.
Long file names are preserved even if they contain extended characters, regardless of the code page that is active during a disk read or write operation.
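The naming conventions listed above (control characters, reserved device names, a trailing space or period) can be checked mechanically. The Python sketch below is one way to do it; the reserved punctuation characters and the full list of device names are not spelled out in the text above, so treat both as assumptions to verify against Microsoft's file naming documentation. The function name windows_name_problems is made up for illustration.

```python
import re

# Assumed reserved characters: < > : " / \ | ? * plus NUL and control codes 1-31.
_RESERVED_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
# Assumed reserved device names; they stay reserved when an extension follows.
_DEVICE_NAMES = {"CON", "PRN", "AUX", "NUL",
                 *(f"COM{i}" for i in range(1, 10)),
                 *(f"LPT{i}" for i in range(1, 10))}


def windows_name_problems(name: str) -> list[str]:
    """Return the conventions a single path component breaks (empty if none)."""
    problems = []
    if not name:
        problems.append("empty name")
    if _RESERVED_CHARS.search(name):
        problems.append("reserved or control character")
    # Compare the part before the first period, case-insensitively.
    if name.split(".", 1)[0].rstrip(" ").upper() in _DEVICE_NAMES:
        problems.append("reserved device name")
    if name.endswith((" ", ".")):
        problems.append("ends with a space or a period")
    return problems


if __name__ == "__main__":
    for candidate in (".temp", "NUL.txt", "report.", "a:b", "notes.txt"):
        print(repr(candidate), windows_name_problems(candidate))
```

The check applies to a single name, not to a whole path, so the special components . and .. are reported as ending with a period and have to be handled by the caller.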
Files using long file names can be copied between NTFS file system partitions and Windows FAT file system partitions without losing any file name information.
Where a file system cannot preserve the long name, the short file name is substituted if possible.
The path to a specified file consists of one or more components, separated by a special character (a backslash), with each component usually being a directory name or file name, but with some notable exceptions discussed below. It is often critical to the system's interpretation of a path what the beginning, or prefix, of the path looks like.
This prefix determines the namespace the path is using, and additionally what special characters are used in which position within the path, including the last character. Each component of a path will also be constrained by the maximum length specified for a particular file system. In general, these rules fall into two categories: short and long. Note that directory names are stored by the file system as a special type of file, but naming rules for files also apply to directory names.
To summarize, a path is simply the string representation of the hierarchy between all of the directories that exist for a particular file or directory name.
For Windows API functions that manipulate files, file names can often be relative to the current directory, while some APIs require a fully qualified path.
A file name is relative to the current directory unless it begins with one of a few specific prefixes. If a file name begins with only a disk designator but not the backslash after the colon, it is interpreted as a relative path to the current directory on the drive with the specified letter.
Note that the current directory may or may not be the root directory depending on what it was set to during the most recent "change directory" operation on that disk.
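The difference between these prefixes can be explored without a Windows machine, because Python's pathlib.PureWindowsPath parses Windows paths on any platform. The four category labels below are illustrative wording, not terms from the Windows API, and the sketch ignores special namespaces such as device paths.

```python
from pathlib import PureWindowsPath


def classify_windows_path(text: str) -> str:
    """Label a path by its prefix, using only its drive and root parts."""
    path = PureWindowsPath(text)
    if path.drive and path.root:
        return "fully qualified"            # drive or UNC share plus a root
    if path.drive:
        return "relative to the current directory on that drive"
    if path.root:
        return "rooted on the current drive"
    return "relative to the current directory"


if __name__ == "__main__":
    for example in (r"C:\temp\a.txt", r"C:a.txt", r"\temp\a.txt", r"..\a.txt"):
        print(f"{example!r}: {classify_windows_path(example)}")
```

A path such as C:a.txt has a drive but no root, which is exactly the case of a disk designator without the backslash after the colon.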