The name translation rules for Objective-C methods and properties are described in terms of a process called “omit needless words”. This is a complicated series of heuristics tuned to the Coding Guidelines for Cocoa, particularly as applied to Apple's SDKs. The full specification for this is, unfortunately, the implementation (in lib/Basic/StringExtras.cpp), because any changes can affect source compatibility.
If you're just looking for a high-level description of how things import, you should probably skip this section and stick to the main CToSwiftNameTranslation.md.
At a high level, omit-needless-words takes as inputs:
and produces a new property or method name through the following algorithm:
1. If the result type is the same as the context type, do a leading type name match (see below) against the base name. If the next word after the match is a preposition, and there's something following the preposition, drop the matching type name.
2. If we're matching a method name rather than a property, do a self type match (see below) against the base name.
3. If we're matching the name of a property or a method with no arguments, and the result type is the same as the context type, do a trailing type name match with special cases (see below) against the base name, then skip to the last step.
4. If the base name starts with “set”, do a trailing type name match with special cases (see below) against it, using the context type.
5. If we're matching a method with at least one argument but we don't have a name for the first argument yet, split the base name (see below) to get a new base name and a name for the first argument.
6. If there are parameters and the first argument name is still empty, perform a trailing type name match with special cases (see below) against the base name, using the first parameter type.
7. Do a trailing type name match with special cases (see below) against each argument name, using the corresponding parameter type.
8. Initialism-lowercase (see below) the first word of the base name and the first word of each argument name.
In order to make decisions about how to translate method names into Swift, the compiler has a hardcoded list of known prepositions and “base verbs” in lib/Basic/PartsOfSpeech.def. The possible word kinds for Swift are “preposition”, “gerund”, “verb”, and “other”.
If a word W case-insensitively matches a known preposition, it's a preposition. (“within”)
If a word W ends with the characters “ing”,
If a word W starts with the characters “auto”, “re”, or “de”, try dropping that prefix. If the resulting word is a verb, W is a verb. This process is recursive. (“autoresend” matches “send”)
Otherwise, W does not have a known part of speech and is considered “other”.
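The preposition and verb-prefix rules above can be sketched roughly as follows. This is a minimal illustration, not the compiler's code: the word lists are hypothetical and heavily abbreviated (the real ones are in lib/Basic/PartsOfSpeech.def), case-insensitive verb matching is an assumption, and the gerund rule is not modeled at all.

```python
# Hypothetical, heavily abbreviated word lists; the real ones live in
# lib/Basic/PartsOfSpeech.def.
PREPOSITIONS = {"at", "for", "from", "in", "to", "with", "within"}
BASE_VERBS = {"load", "make", "send"}
VERB_PREFIXES = ("auto", "re", "de")


def is_verb(word):
    """True if the word is a known verb, possibly behind stacked prefixes."""
    w = word.lower()  # assumption: verbs are matched case-insensitively
    if w in BASE_VERBS:
        return True
    # Try dropping each known prefix and check the remainder recursively.
    return any(w.startswith(p) and is_verb(w[len(p):]) for p in VERB_PREFIXES)


def part_of_speech(word):
    """Classify a word as "preposition", "verb", or "other" (no gerunds)."""
    if word.lower() in PREPOSITIONS:
        return "preposition"
    if is_verb(word):
        return "verb"
    return "other"
```

With these sample lists, `part_of_speech("autoresend")` yields "verb" because "auto" and then "re" are stripped to reach "send", while `part_of_speech("unsend")` yields "other" because “un” is not a recognized prefix.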
The hardcoded verb list has several oddities, including a number of British spellings when most of the Apple SDKs use American spellings. “Un-” is also not one of the prefixes checked for when seeing if something is a verb. Unfortunately, changing these rules could break source compatibility.
Swift's convention for lowerCamelCased names (like property names, method base names, and argument names) says that the first word should be all-lowercase if it's an initialism. For example, a property that represents a URL could be called “url”. Initialism-lowercasing is the transformation that tries to lowercase a possible initialism at the start of a word.
Examples:
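The precise rules are not spelled out here, so the following is only a rough sketch of the idea under a simple assumption: lowercase the leading run of uppercase letters, but keep the last one uppercase when it starts the next word. The function name `initialism_lowercase` is hypothetical, and the compiler's real heuristic may handle more cases than this.

```python
def initialism_lowercase(name):
    """Lowercase a possible initialism at the start of a name (simplified)."""
    if not name or not name[0].isupper():
        return name  # already starts lowercase; nothing to do
    # Measure the leading run of uppercase letters.
    run = 0
    while run < len(name) and name[run].isupper():
        run += 1
    if run == len(name):
        return name.lower()  # the whole name is one initialism
    if run == 1:
        return name[0].lower() + name[1:]  # ordinary capitalized word
    # Assumption: if a lowercase letter follows the run, the last uppercase
    # letter begins the next word and stays uppercase.
    if name[run].islower():
        run -= 1
    return name[:run].lower() + name[run:]
```

Under these assumptions, "URL" becomes "url" and "URLSession" becomes "urlSession", while a name that already starts lowercase is returned unchanged.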
The compiler sometimes needs to check if a name N matches a set of known property names P. The algorithm for this is
Certain suffixes can be stripped from type names for the purposes of matching (see below). The rules for this are fairly straightforward, using the word-splitting algorithm described above:
Examples of stripped suffixes:
A “type word match” attempts to match a “name word” N against a “type word” T. Both are assumed to be words from the word-splitting algorithm described above.
A “type name match” finds the maximum number of type-word-matching words between a series of “name words” NN and a series of “type words” TT. The search in NN can be anchored at the start (a “leading type name match”) or the end (a “trailing type name match”); the search in TT is anchored at the end in both cases. For a leading type name match:
Examples of leading type name matches, with the “overlap” in bold:
A trailing type name match is conceptually the same but has a simpler search strategy:
Examples of trailing type name matches, with the “overlap” in bold:
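As a rough illustration of the trailing search, the sketch below walks backwards from the end of both word lists and counts how many words line up. It makes two simplifying assumptions that are not from the compiler: `split_words` is a stand-in camel-case splitter, and a “type word match” is reduced to case-insensitive equality.

```python
import re


def split_words(name):
    """Simplified camel-case splitter (an assumption, not the compiler's)."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)


def trailing_match_count(name_words, type_words):
    """Count words that match while walking backwards from both ends."""
    count = 0
    while (count < len(name_words) and count < len(type_words)
           and name_words[-1 - count].lower() == type_words[-1 - count].lower()):
        count += 1
    return count
```

For example, the name words of "appendBezierPath" and the type words of "NSBezierPath" overlap in their last two words ("Bezier", "Path"), so the count is 2.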
In practice, while the leading type name match algorithm is used unaltered, trailing type name matches are performed with a variety of additional special cases. In this case, we consider
a series of name words NN
a series of type words TT
an optional additional series of element words EE for a collection element type
a set of known property names
a match kind that is one of the following:
In addition to the rules for matching words as described in normal type word matching,
The general pattern of walking backwards still occurs.
Once a match offset into NN has been found, trailing type name matches still go through some additional checks before stripping the matching range. If any of these conditions are true, the name is left alone.
If none of these conditions are true, the matching part of NN is removed and the remaining string is returned.
Objective-C method names often include a reference to the context type, such as `dismissViewControllerAnimated:`. In Swift, that context type reference is considered superfluous and the result (before base name splitting) should be `dismissAnimated(_:)`. To strip it out, omit-needless-words does a modified trailing type name match with special cases (yes, there's yet another variant of this thing) against the method base name, using the contextual type. What's different:
Additionally, for every match after stripping the last word of a base name, type suffix stripping is applied as much as possible before doing any matches. This was probably unintentional.
Because Objective-C has selector pieces rather than argument labels, it doesn't have a separate notion of “base name” and “first argument label”. Base name splitting is an attempt to bridge that gap.
1. If the parameter is a known boolean type and the last word of the base name is “Animated” (case-sensitive in this case), drop that last word and set the first argument label to “animated”.
2. If the first word of the base name is “set”, don't do any splitting. This probably should have been tested before step #1.
3. If the last word of the parameter type is “Object” (which includes the `id` type) and the body parameter name is “sender”, assume the method is IBAction-like and don't do any splitting.
4. Search backwards word-by-word through the base name to find the last preposition in the name at position P.
5. If the preposition at P is “in” and the previous word is “plug”, give up and don't do any splitting; we found the noun “plug-in” instead.
6. Check the preposition at P and the following word. If it is any of the following pairs, give up and don't do any splitting; these don't describe the argument.
7. Check the preposition at P and the previous word. If it is any of the following pairs, consider the previous word to be part of the preposition as well and move P back one word.
8. If the rest of the base name following the preposition is “X”, “Y”, or “Z”, let the “X”, “Y”, or “Z” be the new argument name, and drop it from the base name to form the new base name. Otherwise, let everything before P be the new base name, and everything after P, including the preposition, be the argument name.
9. If the preposition is “with”, and the following word is not “zone”, and the parameter is not a function type and does not have a default argument, drop the “with” entirely.
10. If the preposition is “using”, and the parameter is not a function type and does not have a default argument, drop the “using” entirely.
11. If the new base name is a reserved Swift member name (currently “init”, “self”, “Protocol”, and “Type”), don't do any splitting after all.
12. If the first word of the new base name is “get”, “for”, “set”, “using”, or “with” (a “vacuous” word), and that's either the entire new name or there's only one other word, don't do any splitting after all.
13. Otherwise, return the new base name and the argument name.
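The splitting procedure above can be approximated with the sketch below. It is deliberately incomplete and should not be read as the compiler's implementation: the preposition list is a hypothetical, abbreviated sample, `split_words` is a stand-in camel-case splitter, the IBAction check and the two preposition-pair checks are omitted, and the handling of a missing or leading preposition is a sketch-only assumption. The returned argument name is not lowercased here; that is left to the final initialism-lowercasing step.

```python
import re

# Hypothetical, abbreviated preposition list; the real one is in
# lib/Basic/PartsOfSpeech.def.
PREPOSITIONS = {"at", "by", "for", "from", "in", "of", "on", "to", "using", "with"}
RESERVED_NAMES = {"init", "self", "Protocol", "Type"}
VACUOUS_WORDS = {"get", "for", "set", "using", "with"}


def split_words(name):
    """Simplified camel-case splitter (an assumption, not the compiler's)."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)


def split_base_name(base, is_bool=False, is_function=False, has_default=False):
    """Return (base name, first argument name); "" means no split happened."""
    words = split_words(base)
    no_split = (base, "")
    if not words:
        return no_split
    # A boolean parameter after "Animated" gets the label "animated".
    if is_bool and words[-1] == "Animated":
        return "".join(words[:-1]), "animated"
    # Setters are never split.
    if words[0] == "set":
        return no_split
    # Find the last preposition, searching backwards.
    p = next((i for i in range(len(words) - 1, -1, -1)
              if words[i].lower() in PREPOSITIONS), None)
    # Sketch-only assumption: give up when there is no preposition, or when
    # it is the first word and would leave an empty base name.
    if p is None or p == 0:
        return no_split
    # "plug" + "in" is the noun "plug-in", not a preposition.
    if words[p].lower() == "in" and words[p - 1].lower() == "plug":
        return no_split
    rest = words[p + 1:]
    if rest in (["X"], ["Y"], ["Z"]):
        # The axis letter becomes the argument; the preposition stays put.
        new_base, arg = "".join(words[:p + 1]), rest[0]
    else:
        new_base, arg_words = "".join(words[:p]), words[p:]
        prep = words[p].lower()
        # Sketch-only assumption: keep the preposition if nothing follows it.
        can_drop = bool(rest) and not is_function and not has_default
        if prep == "with" and can_drop and rest[0].lower() != "zone":
            arg_words = rest
        elif prep == "using" and can_drop:
            arg_words = rest
        arg = "".join(arg_words)
    if new_base in RESERVED_NAMES:
        return no_split
    new_words = split_words(new_base)
    if new_words[0] in VACUOUS_WORDS and len(new_words) <= 2:
        return no_split
    return new_base, arg
```

With the sample lists, `split_base_name("objectAtIndex")` gives `("object", "AtIndex")`, `split_base_name("loadWithOptions")` gives `("load", "Options")`, and `split_base_name("installPlugIn")` is left unsplit because of the “plug-in” check.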
Strip off any typedefs, except those explicitly mentioned below, as well as any pointer or reference types and any other forms of type sugar (like type attributes).
The typedef “BOOL” gets a type name of “Bool” and a special note that it's a boolean type.
The typedefs “NSInteger”, “NSUInteger”, and “CGFloat” are preserved as is.
Typedefs for pointers whose names end in “Array” or “Set” have their names preserved; their pointee type goes through this process as well to be used as the “collection element type” (as described in “trailing type name matching with special cases” above).
Typedefs that refer to CF types (see CToSwiftNameTranslation.md) are preserved as is.
C array types get a type name of “Array”; their element type goes through this process as well to be used as the “collection element type”.
Objective-C selectors (`SEL`) get a type name of “Selector”.
An Objective-C object pointer with only a single protocol and a base type of either `id` or `NSObject` gets mapped as the name of that protocol.
A generic Objective-C class whose name ends in “Array” or “Set” has its name preserved, even if there are protocol qualifications. Its first generic parameter, if present, goes through this process as well to be used as the “collection element type”. If no generic parameters are present, “Object” is used as the collection element type.
A non-generic Objective-C class whose name ends in “Array” or “Set” has its name preserved, even if there are protocol qualifications. Moreover, the same name with the word “Array” or “Set” dropped from the end is used as the “collection element type”.
Other Objective-C classes have their names preserved as is, even if there are protocol qualifications on the type.
Objective-C `id` gets a type name of “Object”, even if it has protocol qualifications.
Objective-C `Class` gets a type name of “Class”, even if it has protocol qualifications.
Tag types (structs, enums, and unions) have their names preserved as is, or their immediately-containing typedef if the tag type itself is anonymous.
All block types get a type name of “Block” and a special note that they are function types.
All C function types get a type name of “Function” and a special note that they are function types.
The built-in types below are mapped as shown:
`void` becomes “Void”
`float` becomes “Float”
`double` becomes “Double”
`char8_t` becomes “UInt8”
`char16_t` becomes “UInt16”
`char32_t` becomes “UnicodeScalar”
The built-in `bool` (`_Bool`) type is mapped to “Bool” with a special note that it's a boolean type.
All C integer types are mapped to “IntN” or “UIntN” based on their signedness and bit-width, including `char` and `wchar_t`.
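A few of the simpler type-naming rules can be written down as a small lookup sketch. The helper names here are hypothetical, the suffix check is a plain string test rather than a real word-based one, and the remaining rules (typedefs, protocols, generics, tag types, blocks) are not modeled.

```python
# Built-in types with fixed names, as listed above.
BUILTIN_TYPE_NAMES = {
    "void": "Void",
    "float": "Float",
    "double": "Double",
    "char8_t": "UInt8",
    "char16_t": "UInt16",
    "char32_t": "UnicodeScalar",
    "bool": "Bool",
    "_Bool": "Bool",
}


def integer_type_name(is_signed, bit_width):
    """Name a C integer type from its signedness and bit-width."""
    return f"{'Int' if is_signed else 'UInt'}{bit_width}"


def nongeneric_collection_element(class_name):
    """Element name for a non-generic class ending in "Array" or "Set".

    Returns None when the class name has no such suffix. Simplification:
    this tests the raw string suffix instead of splitting into words.
    """
    for suffix in ("Array", "Set"):
        if class_name.endswith(suffix) and len(class_name) > len(suffix):
            return class_name[:-len(suffix)]
    return None
```

For instance, `integer_type_name(False, 16)` gives "UInt16", and `nongeneric_collection_element("NSIndexSet")` gives "NSIndex" as the collection element type.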