Chapter 2

Variables and Basic Types

17 minute read

Primitive Built-in Types

C++ primitive types include arithmetic types and a special type named void.

Void type has no associated value and can be used in only a few cases like the return type of a function which does not return any value.

Arithmetic Types

Arithmetic types include:

characters
integers
boolean values
floating point numbers

Arithmetic types are divided into integral types (integers, characters and boolean) and floating-point types.

The size of arithmetic types varies across machines. Hence the largest value a type can represent also varies.

Except for bool and the extended character types, the integral types may be signed or unsigned.

The types short, int, long and long long are all signed.
Unlike integral types, there are three distinct character types - char, signed char and unsigned char. Char is not the same type as signed char. But there are still only two representations. The (plain) char type follows one of the two representaions depending on the compiler.

Deciding which type to use:

Use an unsigned type when you know that the numbers cannot be negative.
Use int for integer arithmentic. If int is smaller than the maximum data value, use long long.
Do not use plain char or bool in arithmetic expressions. Use them only to hold characters or truth values. Computations using char are especially problematic because char is signed on some machines and unsigned on others. If you need a tiny integer, explicitly specify either signed char or unsigned char.
Use double for floating-point computations; float usually does not have enough precision, and the cost of double-precision calculations versus single-precision is negligible. The precision offered by long double usually is unnecessary and often entails considerable run-time cost.

Type Conversion

Among the operations many types support, there is ability to convert objects of one type to another. It happens automatically when an object of a one type is used where an object of another type is expected.

If an out of range value is assigned to an unsigned type, the result it the remainder of the value modulo the number of values that type can store.
If an out of range value is assigned to a signed type, the result is undefined.

Always avoid Udefined and Implementation-defined Behavior.

Don’t mix signed and unsigned types.

Literals

Every litral has a type. The form and value of a literal determine its type.

Integer and Floating-point Literals

Integral literals can be written using decimal, octal or hexadecimal notation.
Integers literals that begin with 0 are octal. Intger literals that begin with 0x or 0X are hexadecimal.
By default, decimal literals are signed whereas octal and hexadecimal literals can be either signed or unsigned types.
A decimal literal has the smallest type of int, long, or long long in which the literal’s value fits.
Octal and hexadecimal literals have the smallest type of int, unsigned int, long, unsigned long, long long, or unsigned long long in which the literal’s value fits.
It is an error to use a literal that is too large to fit in the largest related type.
There are no literals of type short.
Value of a literal is never negative. For example, in -42, the minus sign is not part of the literal. It is an operator that negates the value of its (literal) operand.
Floating-point literals include either a decimal point or an exponent specified using scientific notation.

Character and Character String Literals

A character enclosed within single quotes is a character literal.
Zero or more characters enclosed within double quotes is a string literal.
The type of string literal is an array of constant chars.
The compiler appends the character ‘\0’ to every string literal. Thus size of a string literal is one more than the apparent size.
Two string literals that appear adjacent to one another and that are separated only by spaces, tabs, or newlines are concatenated into a single literal.

We can override the default type of literal by using prefixes and suffixes.

Boolean Literals

The words true and false are boolean literals.

Pointer Literals

The word nullptr is a pointer literal.

Variables

Variable is a named storage that our program can manipulate. Each variable in C++ has a type. The type determines the size and layout of the variable’s memory, the range of values that can be stored in that memory and the set of operations that can be applied to the variable.

Variable Definitions

A variable definition is a type specifier followed by a comma separated list of variable names and an optional initial value.

Intialization

Intialization in C++ is different from assignment, even though they both use similar syntax.

Intialization happens when a variable is given a value at the time of creation. Assignment obliterates an object’s current value and replaces it with a new one.

List Initialization

There are several ways to intialize an object:

int cnt = 0;
int cnt = {0};
int cnt{0};
int cnt(0);

The generalized use of curly braces for intialization was added in C++11. It is called list initialization.

The compiler will not let us list initialize variables of built-in type if the initializer might lead to the loss of information.

Default Initialization

When a variable is defined without an initializer, it is default initialized.

The default value of an object of built-in types depends on where it is defined. Variable defined outside the scope of any function are initialzed to zero. Variables of built-in types defined inside a function are uninitialized. The value of uninitialized variables is undefined. It is an error to copy or to try to access the value of a variable whose value is undefined.

Each class controls how we default initialize its objects. It is for the class to decide if we can define objects of that type without initialization. They can supply an appropiate default value. Some classes force that their objects be initialized by forcing compiler to complain.

Variable Declaration and Definitions

Separate Compilation: C++ allows us to split a program into several files which can be compiled separately.

To support separate compilation, C++ differentiates between variable declaration and definition. A declaration makes the name known to the program. A definition creates the associated entity.

A definition is also a declaration. Apart from providing the name and type of the variable, a definition also allocates storage and may provide the variable with a initial value.

To provide a declaration which is not a definition we add an extern keyword and may not provide an explicit initializer: extern int i;

It is an error to provide an initializer on an extern inside a function.

Any declaration that includes an explicit initializer is a definition. We can provide an initializer on a variable defined as extern, but doing so overrides the extern. An extern that has an initializer is a definition: extern double pi = 3.14;

Scope of a Name

A scope is a part of the program in which a name has a particular meaning. Most scopes in C++ are delimited by curly braces.

The same name can refer to different entities in different scopes. Names are visible from the point where they are declared until the end of the scope in which the declaration appears.

Define a variable where it is used for the forst time.

A name that has been declared in a scope is visible in its nested scopes. Names decalred in the outer scope can be redfined in an inner scope.

Compound Types

A compound type is a type that is defined in terms of another type. Pointers and references are two examples of compound types.

References

The new standard introduces a new type of reference called r-value reference, primarily used inside classes. This discussion is related to l-value reference.

A reference defines an alternate name for an object. If the name of the reference being declared is d, we define a reference type by writing &d.

int ival = 1024;
int &refVal = ival;  // refVal refers to (is another name for) ival
int &refVal2;        // error: a reference must be initialized

A reference must always be intialized. When we define a reference, we bind the reference to the intializer object. We cannot rebind a reference to a different object, thus reference must always be initialized.

A reference is not an object, rather just a name for an already existing object. Thus we cannot define references to a reference.

All operations on a reference are actually operations on the object the reference is bound to.

The types of reference must be exactly same as the object it is refering to. There are two exceptions. First is const references can bind to non const objects. (Revisit)

Pointers

A pointer is a compound type that points to another type.

Like references, pointers are used for indirect access to other objects.
Unlike reference, a pointer is an object in its own right.
Pointers can be assigned and copied. A pointer can point to several different objects during its lifetime.
Unlike reference a pointer need not be initialized when its defined.
Like other built-in types, pointers defined at block scope have undefined value if they are not initialized.

A pointer is defined by writing declarator of form *d. The address of operator is used to get the address of an object.

Because references are not objects, they don’t have addresses. Thus, we cannot define a pointer to a reference.

The type of a pointer must match to the object it is pointing to. Because the type of the pointer is used to infer the type of the object pointer is poining to. If pointer could point to an object of another type, operations performed on the underlying object will fail. There are two exceptions. First is a pointer to a const can point to a non const object. (Revisit)

int g = 10;
double pi = 3.14;

int &iref1 = 10;  // invalid: can't have references to literals
int &iref2 = pi;  // invalid: types don't match while initialization
int &iref3 = g;  iref3 = pi; // valid: no initialization, only narrowing assignment
double &dref1 = pi; dref1 = g; // valid: no initialization, widening assignment

int *iptr1 = &pi;  // invalid: types mismatch
int *iptr2 = &g;  iptr2 = &pi;  // still invalid: no conversion for pointers

The value (i.e., the address) stored in a pointer can be in one of four states:

It can point to an object.
It can point to the location just immediately past the end of an object. (Used in array traversals and iteratorrs; must never be dereferenced.)
It can be a null pointer, indicating that it is not bound to any object.
It can be invalid; values other than the preceding three are invalid. Result of accessing an invalid pointer is undefined.

Pointers in state 2 and 3 are valid, but they do not point to any object. Beahvior when accessing an object via these pointers is undefined.

Dereference operator (*) is used to access an object pointed by a pointer. Since dereferencing a pointer yields the object to which the pointer points, we can assign to that object by assigning to the result of dereference. *p = 0 assigns 0 to object p is pointing to.

Key Concept: Some Symbols Have Multiple Meanings Some symbols, such as & and *, are used as both an operator in an expression and as part of a declaration. The context in which a symbol is used determines what the symbol means:

int i = 42;
int &r = i;   // & follows a type and is part of a declaration; r is a reference
int *p;       // * follows a type and is part of a declaration; p is a pointer
p = &i;       // & is used in an expression as the address-of operator
*p = i;       // * is used in an expression as the dereference operator
int &r2 = *p; // & is part of the declaration; * is the dereference operator

In declarations, & and * are used to form compound types. In expressions, these same symbols are used to denote an operator. Because the same symbol is used with very different meanings, it can be helpful to ignore appearances and think of them as if they were different symbols.

Null Pointers

A null pointer does not point to anything. A null pointer can be initialized in the following ways.

int *p1 = nullptr; // equivalent to int *p1 = 0;
int *p2 = 0;       // directly initializes p2 from the literal constant 0
// must #include cstdlib
int *p3 = NULL;    // equivalent to int *p3 = 0;

nullptr is a literal with a special type which can be converted to any other pointer type.

NULL is a preprocessor vriable defined in cstdlib. Preprocessor variables are handled by preprocessor and are not a part of standard namespace.

Initialize all pointers: Define a pointer only after the pointed object has been defined. If not possible, initialize the pointer to nullptr.

void* pointers

The void* is a special pointer type which can hold address of any object. The type of the object a void* pointer points to is unknown.

There are only a limited number of things we can do with a void* pointer: We can compare it to another pointer, we can pass it to or return it from a function, and we can assign it to another void* pointer. We cannot use a void* to operate on the object it addresses—we don’t know that object’s type, and the type determines what operations we can perform on the object.

Understanding Compound Type Declarations

References to a pointer

int i = 42;
int *p;        // p is a pointer to int
int *&r = p;   // r is a reference to the pointer p
r = &i; //  r refers to a pointer; assigning &i to r makes p point to i
*r = 0; //  dereferencing r yields i, the object to which p points; changes i to 0

The easiest way to understand the type of r is to read the definition right to left. The symbol closest to the name of the variable (in this case the & in &r) is the one that has the most immediate effect on the variable’s type. Thus, we know that r is a reference. The rest of the declarator determines the type to which r refers. The next symbol, * in this case, says that the type r refers to is a pointer type. Finally, the base type of the declaration says that r is a reference to a pointer to an int.

`const` Qualifier

A const variable’s value cannot be changed, thus must always be initialized. A const type can use most but not all of the operations of the non-const type.

By default, const objects are local to a file. If a const object is initialized from a compile time constant, compiler replaces all the occurences of that variable with its value. To do this replacement compiler must see the definition of the variable and since files are compiled separately, the scope of a const object is limited to the file. Thus same named constants in different files are different in each file.

If a constant variable has to be shared across files but its initializer is not a compile time constant, it must be defined in exactly one file and can be declared in other files. To define a single instance of cont variable across files, use extern keyword in both its definition and declaration.

References to `const`

const int ci = 1024;
const int &r1 = ci;   // ok: both reference and underlying object are const
r1 = 42;              // error: r1 is a reference to const
int &r2 = ci;         // error: non const reference to a const object

There are two exceptions to the rule that the type of a reference must match the type of the object to which it refers. The first exception is that we can initialize a reference to const from any expression that can be converted to the type of the reference.

int i = 42;
const int &r1 = i;      // we can bind a const int& to a plain int object
const int &r2 = 42;     // ok: r1 is a reference to const
const int &r3 = r1 * 2; // ok: r3 is a reference to const
int &r4 = r * 2;        // error: r4 is a plain, non const reference

Pointers and `const`

Like references, we can define pointers which points to wither const or nonconst types.
Like const references, a pointer to a const may not be used to change the object it is pointing to. Pointer of a const object can only be stored in a pointer to const.

const double pi = 3.14;     // pi is const; its value may not change
double *ptr = &pi;          // error: ptr is plain pointer
const double *cptr = &pi;   // ok: cptr may point to a const double
*cptr = 42;                 // error: cannot assign to *cptr

There are two exceptions to the rule that the types of a pointer and the object to which it points must match. The first exception is that we can use a pointer to const to point to a nonconst object.

double dval = 3.14;       // dval is a double; its value can be changed
cptr = &dval;             // ok: but can't change dval through cptr

const pointers

Unlike references, pointers are objects. Hence we can have pointers which are themselves constant. A constant pointer must be initialzed and post that its value cannot be changed. We indicate that a pointer is const by putting const keyword after * in its definition.

int errNumb = 0;
int *const curErr = &errNumb;  // curErr will always point to errNumb
const double pi = 3.14159;
const double *const pip = &pi; // pip is a const pointer to a const object

The fact that a pointer is itself const says nothing about whether we can use the pointer to change the underlying object. Whether we can change that object depends entirely on the type to which the pointer points.

Top-Level const

We use the term top-level const to indicate that the pointer itself is a const. When a pointer can point to a const object, we refer to that const as a low-level const.

Top-level consts can appear in any object type i.e. arithmetic, class and pointer types. Low-level consts appear in base type of compound types like references and pointers.

int i = 0;
int *const p1 = &i;         // we can't change the value of p1; const is top-level
const int ci = 42;          // we cannot change ci; const is top-level
const int *p2 = &ci;        // we can change p2; const is low-level
const int *const p3 = p2;   // right-most const is top-level, left-most is not
const int &r = ci;          // const in reference types is always low-level

When we copy an object top-level consts are ignored. While low-level consts are never ignored.

`constexpr` and Constant Expressions

A constant expression is an expression whose value cannot change and that can be evaluated at the compile time. A constant object that is initialized from a constant expression or constexpr functions is also a constant expression.

const int max_files = 20;           // max_files is a constant expression
const int limit = max_files + 1;    // limit is a constant expression
int staff_size = 27;                // staff_size is not a constant expression
const int sz = get_size();          // sz is not a constant expression

In C++11 standard, we can ask the compiler to verify that a variable is a constant expression by declaring the variable in a constexpr declaration.

constexpr int mf = 20;          // 20 is a constant expression
constexpr int limit = mf + 1;   // mf + 1 is a constant expression
constexpr int sz = size();      // ok only if size is a constexpr function

Literal Types: The types which can be used in a constant expression. Examples, arithmetic, reference and pointer types. Note that pointers and references can be defined as constexprs but the values these can be initialized to are limited:

We can initialize a constexpr pointer from a nullptr or the literal 0.
We can also point to (or bind to) an object that remains at a fixed address. Variable defined inside the functions are not stored a fixed address. Address of any object defined outside a function is fixed. Functions may define variables that exist across function calls, thereby having a fixed address. (Revisit)

Dealing with Types

Type Aliases

There are two ways to define type aliases:

typedef: Syntax is same as that of a variable declaration except type aliases are defined instead of variables.
Alias declaration: C++11 standard introduces alias declaration. Syntax starts with using keyword followed by alias name and an equal =. The type appears on right side of =.

typedef char *pstring;
const pstring cstr = 0;     // cstr is a constant pointer to
charconst pstring *ps;      // ps is a pointer to a constant pointer to char

The base type in these declarations is const pstring. As usual, a const that appears in the base type modifies the given type. The type of pstring is “pointer to char.” So, const pstring is a constant pointer to char—not a pointer to const char.

Last modified May 28, 2025: Fix weights of cpp primer (7877b0b)