JavaScript identifiers

Introduction

Definition:
An identifier is a value that is unique among all the other identifiers in a system.

Each system defines its own rules for what counts as a valid identifier. Identifiers can be numbers, sequences of characters or a combination of both. They are used in all spheres of life, mainly when we want to distinguish clearly between similar objects; postal codes and country names are everyday examples. The common abbreviation for identifier is "ID".

In high-level programming languages, identifiers are mostly used to name variables, functions, object properties and so on. They give programmers an easy way to name data structures and refer to them in various language constructs.

JavaScript is no exception. Identifiers are used to name variables, functions, function parameters, object properties and statement labels.

JavaScript source text uses the Unicode standard, version 2.1 or later. Each character of the program is represented in the UTF-16 encoding.

To define a variable or a function, a valid identifier must be used, according to the rules of the ECMA-262 standard.

var foo; 
function bar(){}

The validity of identifiers (names) is checked during the syntax analysis of the program. If a name is invalid, the interpreter reports an error of type SyntaxError and the program does not run.

var 2; //SyntaxError
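
Because the check happens while the source is parsed, it can be observed without stopping the surrounding program. The following is a minimal sketch, assuming an environment that provides the Function constructor and console.log (used here in place of print). The helper name isValidVariableName is hypothetical and expects a single candidate name, not arbitrary source text.

```javascript
// Hypothetical helper, not part of the standard: reports whether the
// engine accepts `name` as a variable name.
function isValidVariableName(name) {
    try {
        // The Function constructor parses the body but does not run it.
        new Function('var ' + name + ';');
        return true;
    } catch (e) {
        if (e instanceof SyntaxError) {
            return false;
        }
        throw e; // anything else is unexpected
    }
}

console.log(isValidVariableName('foo')); // true
console.log(isValidVariableName('2'));   // false
```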

Knowing the syntax rules of the ECMA-262 standard well helps us write robust programs.

ECMA-262 3rd edition. Section 7.6 Identifiers
Identifier ::
IdentifierName but not ReservedWord

Reserved words from the language

Reserved words are tokens with a special meaning to the interpreter. They mark out the different language constructs and are allowed only at precisely determined positions, according to the syntax rules of the language. These words cannot be used as identifiers.
In JavaScript they are divided into four groups:

Keywords

break else new var case
finally return void catch for
switch while continue function this
with default if throw delete
in try do instanceof typeof

Reserved words for future use

abstract enum int short boolean
export interface static byte extends
long super char final native
synchronized class float package throws
const goto private transient debugger
implements protected volatile double import
public        

Boolean literal

true false

Null literal

null

Each of the following declarations causes a syntax error, because a reserved word is used where the interpreter expects a valid identifier:

var if; //SyntaxError
var super; //SyntaxError
var true; //SyntaxError

In the following example, the three identifiers are valid:

var IF;
var SUPER;
var TRUE;

The reason is that JavaScript is case-sensitive. The three identifiers are not reserved words because they are written differently from the words in the lists above. In ECMA-262 implementations, two identifiers are considered the same only when they consist of exactly the same sequence of Unicode values. For the same reason, an identifier written with Unicode escape sequences names the same variable as its literal spelling:

var \u0066\u006F\u006F = true;
print(foo); //true
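
Both points can be tried together in one short script. This is only an illustrative sketch: console.log stands in for print, and the variable names value and Value are made up for the example.

```javascript
// Only the exact lower-case spellings are reserved words.
var IF = 1, SUPER = 2, TRUE = 3;
console.log(IF + SUPER + TRUE); // 6

// The escaped spelling and the literal spelling denote the same identifier.
var \u0066\u006F\u006F = true;
console.log(foo); // true

// Names that differ only in case are different variables.
var value = 'lower';
var Value = 'upper';
console.log(value === Value); // false
```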

Identifier names

An identifier name consists of one or more characters. The standard does not impose any limit on the maximum length of identifiers.
ECMA-262 splits the rules for identifier names into two parts: the characters allowed at the start of a name and the characters allowed in the rest of it.

Every character that can start an identifier name is also allowed in the rest of the name. The reverse is not true.

var $2$; //valid identifier
var 2$2; //SyntaxError

Initial characters in identifiers

The following characters can be used as initial characters in identifiers:

Underscore sign "_"

JavaScript has no access modifiers for object properties. By convention, programmers start a property name with an underscore to signal that users of the code should not manipulate the property directly, because doing so might break the program. In other words, a leading underscore is an informal emulation of the "private" modifier.
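
A small sketch of the convention follows. The counter object and its members are hypothetical names chosen for the illustration. Note that the underscore changes nothing for the interpreter: the property stays fully accessible.

```javascript
var counter = {
    // Leading underscore: "internal" by convention only.
    _count: 0,
    increment: function () {
        this._count += 1;
        return this._count;
    }
};

console.log(counter.increment()); // 1
// Nothing prevents outside code from reading or changing the property.
console.log(counter._count);      // 1
```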

$

The dollar sign can also be used as the initial character of an identifier name. ECMA-262-3 recommends that it be used only in mechanically generated code. Most JavaScript libraries use it differently, in spite of the recommendation of the standard.

A letter from the Unicode table

The Unicode standard classifies characters into separate categories, which group together characters with a similar role. In JavaScript, a Unicode letter is any character that belongs to one of the following Unicode categories:

Uppercase letter (Lu)
Lowercase letter (Ll)
Titlecase letter (Lt)
Modifier letter (Lm)
Other letter (Lo)
Letter number (Nl)

Each of the characters in these categories can be used as the initial character of an identifier.

Unicode escape sequence

In JavaScript a character can be written either directly or as an escape sequence that gives its value in the Unicode table. Escape sequences are allowed at precisely determined positions:

Identifiers
String literals
Regular expression literals

In comments, escape sequences are ignored and have no effect on the comment.

The syntax of a Unicode escape sequence is:

\u006E

It always starts with \u, followed by exactly four hexadecimal digits that give the value of the character in the Unicode table. The example above represents the Latin character n as a Unicode escape sequence.
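
The relationship between a character and its escape sequence can be checked with a few lines of code. This is a sketch only: toUnicodeEscape is a hypothetical helper, it handles a single UTF-16 code unit, and console.log stands in for print.

```javascript
// Hypothetical helper: builds the \uXXXX form for the first code unit of `ch`.
function toUnicodeEscape(ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    while (hex.length < 4) {
        hex = '0' + hex; // pad to exactly four hexadecimal digits
    }
    return '\\u' + hex;
}

console.log(toUnicodeEscape('n')); // \u006E

// In a string literal the escape and the character are the same value.
var fromEscape = '\u006E';
console.log(fromEscape === 'n'); // true

// In an identifier the escape declares the variable n.
var \u006E = 42;
console.log(n); // 42
```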

Escape sequences are allowed anywhere in an identifier.
A character that is invalid when written directly does not become valid when written as an escape sequence. Invalid characters remain invalid regardless of how they are represented.

Decimal digits (Unicode category Nd) are not allowed at the beginning of identifiers. Even when a digit is written as an escape sequence at the beginning, the identifier is still invalid and the program is rejected with a syntax error:

/**
 * equivalent of:
 * var 2;
 * the expected result is a syntax error
 */
var \u0032;

In this context, an interesting question is how the interpreter behaves when a reserved word is written as an identifier using escape sequences.

/**
 * equivalent of:
 * var if; 
 */
var \u0069\u0066;

Before the identifier is checked for validity, the escape sequences are normalized to if. The program must therefore be rejected with a syntax error, because if is a reserved word.

Some implementations do not obey this rule and allow keywords to be written as identifiers via escape sequences. In these implementations, a token that contains escape sequences is never treated as a word with special meaning to the interpreter.

/**
 * equivalent to:
 * if(true); 
 */
\u0069\u0066(true);

The expected result in this example is a syntax error, because a keyword is used as an identifier.

In implementations which allow keywords to be written as identifiers via escape sequences, the above example is interpreted according to the rules in "11.2.3 Function Calls" of the ECMA-262-3 standard. In other words, it is treated as a function call rather than as an "if" statement.
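
One way to find out how a particular implementation treats these cases is to hand the source to the Function constructor, which parses it without running it. The sketch below assumes console.log in place of print; parses is a hypothetical helper. The last two results depend on the implementation, so no output is shown for them: an implementation that follows the rule reports false for both.

```javascript
// Hypothetical helper: true when the engine can parse `source`.
function parses(source) {
    try {
        new Function(source);
        return true;
    } catch (e) {
        if (e instanceof SyntaxError) {
            return false;
        }
        throw e;
    }
}

// '\\u' in a string literal puts a real backslash-u into the parsed source.
console.log(parses('var \\u006E;')); // true: the identifier n
console.log(parses('var \\u0032;')); // false: a digit cannot start a name

// Implementation-dependent, as described above.
console.log(parses('var \\u0069\\u0066;'));
console.log(parses('\\u0069\\u0066(true);'));
```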

The rest of the identifier name

In this part of the name, the following Unicode categories are allowed in addition to the characters and categories that can start an identifier:

Non-spacing mark (Mn)
Combining spacing mark (Mc)
Decimal number (Nd)
Connector punctuation (Pc)

Characters belonging to these categories may be used in an identifier, but only after the first character.
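
The two sets of rules can be approximated with a regular expression. The sketch below is not how an interpreter works internally. It assumes an engine that supports Unicode property escapes (\p{...} with the u flag), which came long after ECMA-262-3, and it assumes the ECMA-262-3 categories: Lu, Ll, Lt, Lm, Lo and Nl for the first character, plus Mn, Mc, Nd and Pc for the rest. It ignores escape sequences and reserved words, and the names START, PART and isIdentifierName are hypothetical.

```javascript
// Characters allowed at the start: Unicode letters, "$" and "_".
var START = '\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}$_';
// Characters allowed afterwards: everything above plus marks, digits
// and connector punctuation.
var PART = START + '\\p{Mn}\\p{Mc}\\p{Nd}\\p{Pc}';

var identifierName = new RegExp('^[' + START + '][' + PART + ']*$', 'u');

function isIdentifierName(text) {
    return identifierName.test(text);
}

console.log(isIdentifierName('$2$'));      // true
console.log(isIdentifierName('2$2'));      // false
console.log(isIdentifierName('a\u0301'));  // true: combining mark after a letter
console.log(isIdentifierName('\u0301a'));  // false: combining mark first
```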

Names and object property accessing

In JavaScript, object properties are accessed in two ways. The syntax of both notations is defined in the standard:

ECMA-262 3rd edition. Section 11.2.1 Property Accessors

Dot notation:
MemberExpression . Identifier
CallExpression . Identifier

Bracket notation:
MemberExpression [ Expression ]
CallExpression [ Expression ]

Dot notation

In dot notation, property names are checked against the rules for identifiers described above. Every form that is invalid as an identifier is also invalid as a property name after the dot.

var foo = new Object();
foo.if; //Syntax Error
foo.2; //Syntax Error
foo.true; //Syntax Error

In dot notation, the property name cannot come from the result of an expression or from the value of a variable.

var foo = new Object(),
    bar = 'property';
/**
 * This does not access "property", the value of the variable bar.
 * It accesses the property named "bar".
 */
foo.bar;
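
The difference becomes visible when the object has both properties. In this sketch the property values are made up for the illustration, console.log stands in for print, and the second access uses the square bracket notation covered in the next section.

```javascript
var foo = {
    bar: 'value of foo.bar',
    property: 'value of foo.property'
};
var bar = 'property';

// Dot notation uses the name "bar" itself, not the variable's value.
console.log(foo.bar);  // "value of foo.bar"

// Bracket notation evaluates the variable first.
console.log(foo[bar]); // "value of foo.property"
```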

Square bracket notation

In square bracket notation, the result of evaluating the expression is used as the property name.
The characters which make up the property name are not checked against the syntax rules for identifiers. Character sequences that would be invalid as identifiers are valid with this notation.

var foo = new Object();
foo['if']; 
foo[2];
foo[true];

The result of evaluating the expression in the brackets is converted to a string according to the algorithm in section "9.8 ToString" of the ECMA-262-3 standard, and that string is used as the property name.
In our example the following properties are accessed:

"if"
"2"
"true"

In square bracket notation, property names made up of these character sequences are allowed.
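
The string conversion can be observed directly: different expressions that convert to the same string reach the same property. A minimal sketch follows, with made-up values and console.log in place of print. The key object is a hypothetical example of a value with its own toString method.

```javascript
var foo = {};
foo[2] = 'two';
foo[true] = 'yes';
foo['if'] = 'keyword';

// The number 2 and the string "2" name the same property.
console.log(foo['2']);    // "two"
console.log(foo[1 + 1]);  // "two"
console.log(foo['true']); // "yes"

// An object is converted to a string as well, here through its toString method.
var key = { toString: function () { return 'if'; } };
console.log(foo[key]);    // "keyword"
```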

The advantage of square bracket notation over dot notation is that property names can be created while the program is running. In dot notation, this is possible only through dynamic compilation with eval or the Function constructor.

var foo = new Object(),
    bar = 'property';
	
foo[bar] = true;
print(foo.property); //true

In this case, a property named "property" is added. The expression in the brackets is evaluated according to the rules of ECMA-262-3 in "11.1.2 Identifier Reference", "10.1.4 Scope Chain and Identifier Resolution" and "8.7.1 GetValue (V)". The resulting value is used as the property name. The primitive Boolean value true is assigned to the property named "property".
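
Property names built at run time do not even have to be valid identifiers. The following sketch assembles names from parts in a loop; the settings object and the name parts are hypothetical, and console.log stands in for print.

```javascript
var settings = {};
var names = ['width', 'height', 'depth'];

for (var i = 0; i < names.length; i++) {
    // "max-width" contains a hyphen, so it could never follow a dot.
    settings['max-' + names[i]] = (i + 1) * 100;
}

console.log(settings['max-width']); // 100
console.log(settings['max-depth']); // 300
```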

Object initialization ({ })

Object initialisers are also known as object literals.
Object literals are a part of the ECMA-262-3 standard. Their advantage is the clarity and brevity with which the properties of the created object are defined.

ECMA-262 3rd edition. Section 11.1.5 Object Initialiser

ObjectLiteral :
{}
{ PropertyNameAndValueList }

PropertyNameAndValueList :
PropertyName : AssignmentExpression
PropertyNameAndValueList , PropertyName : AssignmentExpression

PropertyName :
Identifier
StringLiteral
NumericLiteral

According to the syntax of object literals, properties of the created object are defined within the curly braces and receive the values of the expressions on the right-hand side of the colon. Each additional property definition is separated from the previous one by a comma.

var obj = {},
    obj1 = {property : true},
    obj2 = {
	    property : true,
	    method : function (){
	        return true;
	    }
    };

print(obj1.property); //true

print(obj2.property); //true
print(obj2.method()); //true

The example creates three objects and assigns a reference to each of them to the variables obj, obj1 and obj2. The values passed to the print function are the values assigned to the properties during object initialization. In the last call, print receives the result of calling the function that the method property refers to.

The prototype chain of the three objects is:

obj  -> Object.prototype -> null
obj1 -> Object.prototype -> null
obj2 -> Object.prototype -> null
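
The chain can be inspected from code. Note the assumption in this sketch: Object.getPrototypeOf is not part of ECMA-262-3 and was introduced in the 5th edition, while isPrototypeOf is available in both. console.log stands in for print.

```javascript
var obj = {},
    obj1 = { property: true };

// Available in ECMA-262-3: is Object.prototype in the chain?
console.log(Object.prototype.isPrototypeOf(obj1)); // true

// Requires ECMAScript 5 or later: walk the chain one step at a time.
console.log(Object.getPrototypeOf(obj) === Object.prototype);  // true
console.log(Object.getPrototypeOf(Object.prototype) === null); // true
```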

Property Name in object literal

var obj = {
	(2 + 2) : false, //Syntax Error
	2abc : false, //Syntax Error
	if : false //Syntax Error
};

The names of the three properties are invalid according to the grammar of ECMA-262-3: a property name in an object literal must be an identifier, a string literal or a numeric literal. The program is rejected with a syntax error when it is parsed.
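
The same properties can still be defined by using the other two forms of PropertyName, a string literal or a numeric literal. The sketch below uses console.log in place of print. As a side note, ECMAScript 5 and later also accept reserved words as unquoted property names, but the quoted form works in both editions.

```javascript
var obj = {
    'if': false,   // string literal: a reserved word is fine here
    '2abc': false, // string literal: may start with a digit
    4: false       // numeric literal instead of the expression (2 + 2)
};

console.log(obj['if']);   // false
console.log(obj['2abc']); // false
// A computed name is possible when accessing, via bracket notation.
console.log(obj[2 + 2]);  // false
```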

Conclusion

Regardless of the naming conventions you follow, always think about the people who are going to read and maintain your code.

Use meaningful names wherever you can; this will make your programs easier to maintain.

Author: Asen Bozhilov
With the cooperation of:
Stoyan Stefanov
Dr J R Stockton

Date of publication: 2010-10-15