C Data Types

The compiler in the C programming language must know the type of the data in order to operate on it. Once the data type of a value is known, it is possible to know the characteristics of that value and how to manipulate it.

There are three basic data types: character (char), integer (int), and floating point (float). Complex types are built on top of them.

Character types

A character type is a single character, and the type declaration uses the char keyword.

char c = 'B';

The above example declares the variable c as a character type and assigns it the letter B.

The C language specifies that character constants must be placed inside single quotes.

Character types are stored in a single byte (8 bits on virtually all systems) and are treated as integers by the C programming language, so a character type is an integer with a width of one byte. Each character corresponds to an integer (its ASCII code on most systems); for example, B corresponds to the integer 66.

The default range of character types varies from computer to computer. Some systems default to -128 to 127, while others default to 0 to 255. Both ranges include the entire ASCII character range of 0 to 127.

Integers and characters are interchangeable: an integer can be assigned to a variable of the character type as long as it is within the range of the character type.

char c = 66;
// equal to
char c = 'B';

In the above example, the variable c is a character type and the value assigned to it is the integer 66, which has the same effect as the value assigned to the character B.

Two variables of character type can perform mathematical operations.

char a = 'B'; // equal to   char a = 66;
char b = 'C'; // equal to   char b = 67;
printf("%d\n", a + b); // output 133

In the above example, the character variables a and b are added together as if they were two integers. The placeholder %d prints a decimal integer, so the output is 133.

The single quote itself is also a character, and in order to represent this character constant, it must be escaped using a backslash.

char t = '\'';

In the above example, the variable t holds the single-quote character. Since character constants must be placed inside single quotes, the inner single quote is escaped with a backslash.

This escaped writing style is mainly used to represent some non-printable control characters defined in ASCII codes that are also character type values.

  • \a: alert, which makes the terminal beep, flash, or both.
  • \b: backspace, the cursor goes back one character, but does not delete the character.
  • \f: form feed, the cursor moves to the next page.
  • \n: newline character.
  • \r: carriage return character, the cursor moves to the start of the current line.
  • \t: tab character, the cursor moves to the next horizontal tab position, usually the next multiple of 8.
  • \v: vertical tab, the cursor moves to the next vertical tab position, usually the same column of the next line.
  • \0: null character, representing no content. Its integer value is 0, which is not the same as the digit character '0' (ASCII 48).

A character can also be written as an escape sequence using its octal or hexadecimal code.

char x = 'B';
char x = 66;
char x = '\102'; // octal
char x = '\x42'; // hexadecimal

All four of the above examples are written in equivalent ways.

Integer Types

The integer type is used to represent larger integers, and the type declaration uses the int keyword.

int a;

The above example declares an integer variable a.

The size of the int type varies from computer to computer. It is more common to use 4 bytes (32 bits) to store a value of type int, but 2 bytes (16 bits) or 8 bytes (64 bits) can also be used. The range of integers they can represent is as follows.

  • 16-bit: -32,768 to 32,767.
  • 32-bit: -2,147,483,648 to 2,147,483,647.
  • 64-bit: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

signed, unsigned

The C programming language uses the signed keyword to indicate that a type has a positive or negative sign and contains negative values, and the unsigned keyword to indicate that the type does not have a positive or negative sign and can only represent zero and positive integers.

The int type is signed by default, that is, int is equivalent to signed int. Since this is the default, the keyword signed is usually omitted, but writing it is not an error.

signed int a;
// equal to
int a;

The int type can also be used without a positive or negative sign, to represent only non-negative integers. In this case, the variable must be declared with the keyword unsigned.

unsigned int a;

The advantage of declaring an integer variable as unsigned is that the maximum integer value that can be represented by the same length of memory is doubled. For example, the maximum value of a 16-bit signed int is 32767, while the maximum value of an unsigned int is increased to 65535.

The int keyword in unsigned int can be omitted, so the above variable declaration can also be written as follows.

unsigned a;

The character type char can also be set signed and unsigned.

signed char c; //  range -128 to 127
unsigned char c; // range 0 to 255

Note: C does not specify whether the char type is signed or unsigned by default; this is determined by the current system. This means that char is not equivalent to signed char: it may behave like either signed char or unsigned char. This is unlike int, which is always equivalent to signed int.

Subtypes of integers

If the int type uses 4 or 8 bytes to represent an integer, this is a waste of space for small integers. On the other hand, in some cases, larger integers are needed, and 8 bytes are not enough. To solve these problems, C provides three integer subtypes in addition to the int type. This facilitates finer scoping of integer variables and better expresses the intent of the code.

  • short int (abbreviated as short): generally occupies 2 bytes (the integer range is -32768 to 32767).
  • long int (abbreviated as long): occupies no less space than int, at least 4 bytes.
  • long long int (abbreviated as long long): occupies no less space than long, at least 8 bytes.

short int a;
long int b;
long long int c;

The above code declares three variables of integer subtypes.

By default, short, long, and long long are signed, i.e., the signed keyword is omitted. They can also be declared unsigned, which doubles the maximum value that can be represented.

unsigned short int a;
unsigned long int b;
unsigned long long int c;

C allows the int keyword to be omitted, so the variable declaration statement can also be written as follows.

short a;
unsigned short a;
long b;
unsigned long b;
long long c;
unsigned long long c;

The byte lengths of data types differ between computers. When you really need a 32-bit integer, use the long type rather than int, because long is guaranteed to be no less than 4 bytes; when you really need a 64-bit integer, use the long long type, which is guaranteed to be no less than 8 bytes. On the other hand, to save space, use the short type when only a 16-bit integer is needed, and the char type when an 8-bit integer is needed.

Limit values for integer types

Sometimes we need to check the maximum and minimum values of different integer types on the current system. The standard header file limits.h provides the corresponding constants, such as SCHAR_MIN for the minimum value of the signed char type (-128) and SCHAR_MAX for its maximum value (127).

For the sake of code portability, you should try to use these constants when you need to know the limit value of some integer type.

  • SCHAR_MIN, SCHAR_MAX: the minimum and maximum values of signed char.
  • SHRT_MIN, SHRT_MAX: the minimum and maximum values of short.
  • INT_MIN, INT_MAX: the minimum and maximum values of int.
  • LONG_MIN, LONG_MAX: the minimum and maximum values of long.
  • LLONG_MIN, LLONG_MAX: the minimum and maximum values of long long.
  • UCHAR_MAX: the maximum value of unsigned char.
  • USHRT_MAX: the maximum value of unsigned short.
  • UINT_MAX: the maximum value of unsigned int.
  • ULONG_MAX: The maximum value of unsigned long.
  • ULLONG_MAX: the maximum value of unsigned long long.

Integers in Different Bases

The integers in C are decimal numbers by default. If you want to represent octal and hexadecimal numbers, you must use a specialized representation.

Octal uses 0 as a prefix, such as 017, 0377.

int a = 012; // octal, equivalent to 10 in decimal

Hexadecimal uses 0x or 0X as a prefix, such as 0xf, 0X10.

int a = 0x1A2B; // hexadecimal, equivalent to 6699 in decimal

Some compilers accept the 0b prefix for binary numbers, but it was not part of the C standard until C23.

int x = 0b101010;

Note: The different bases are just ways of writing an integer and have no effect on how the integer is actually stored. All integers are stored in binary form, regardless of how they are written. Different bases can be mixed; for example, 10 + 015 + 0x20 is a legal expression.

The printf() placeholders for integers in different bases are as follows:

  • %d: decimal integer.
  • %o: octal integer.
  • %x: hexadecimal integer.
  • %#o: displays octal integers prefixed with 0.
  • %#x: displays a hexadecimal integer prefixed with 0x.
  • %#X: displays the hexadecimal integer prefixed with 0X.

int x = 100;
printf("dec = %d\n", x); // 100
printf("octal = %o\n", x); // 144
printf("hex = %x\n", x); // 64
printf("octal = %#o\n", x); // 0144
printf("hex = %#x\n", x); // 0x64
printf("hex = %#X\n", x); // 0X64

Floating-point type

Any value with a decimal point will be interpreted by the compiler as a floating point number.

The type declaration for floating-point numbers uses the float keyword, which declares a floating-point variable.

float c = 10.5;

In the above example, the variable c is a floating-point type.

The float type takes up 4 bytes (32 bits) on most systems: 8 bits hold the value and sign of the exponent, and the remaining 24 bits hold the value and sign of the significand (the fractional digits). The float type provides at least 6 significant decimal digits, and its decimal exponent ranges at least from -37 to 37.

Sometimes the precision or range of values provided by 32-bit floating-point numbers is not enough, and C provides two other larger floating-point types.

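  • double: usually occupies 8 bytes (64 bits) and typically provides about 15 significant digits (the C standard guarantees at least 10).
  • long double: often occupies 16 bytes, although the size varies between platforms.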

Note: due to precision limitations, a floating point number is only an approximation and its calculation is not exact, for example, 0.1+0.2 in C is not equal to 0.3, but has a small error.

if (0.1 + 0.2 == 0.3) // false

C allows the use of scientific notation for floating-point numbers, using the letter e to distinguish between the fractional part and the exponential part.

double x = 123.456e+3; // 123.456 x 10^3
// equal to
double x = 123.456e3;

Boolean type

C originally did not have a separate type for Boolean values, but instead used the integer 0 for false and all non-zero values for true.

int x = 1;
if (x) {
  printf("x is true!\n");
}

In the above example, the variable x is equal to 1. C assumes that this value represents true and therefore executes the code inside the decision body.

The C99 standard adds the _Bool type, which represents a boolean value. It is an unsigned integer type that can hold only 0 (false) and 1 (true); assigning any non-zero value to it stores 1. The example below shows its use.

_Bool isNormal;
isNormal = 1;
if (isNormal)
  printf("Everything is OK.\n");

The header file stdbool.h defines bool as an alias for _Bool, and defines true as 1 and false as 0. These names can be used once the header file is included.

#include <stdbool.h>
bool flag = false;

In the above example, after loading the header file stdbool.h, you can use bool to define the boolean type.

Literal types

A literal is a value that appears directly inside the code.

int x = 123;

In the above code, x is the variable and 123 is the literal.

Literals are also written to memory at compile time, so the compiler must determine the data type of each literal, just as it must know the data type of each variable.

Normally, a decimal integer literal (e.g. 123) is given type int by the compiler. If the value is larger than int can represent, it is given type long int; if it exceeds long int as well, it is given type long long (since C99). Octal and hexadecimal literals can also take the corresponding unsigned type at each step when the value does not fit the signed one.

Literals with a fractional part (e.g. 3.14) are given type double.

Literal suffixes

Sometimes a programmer wants to specify a different type for a literal. For example, if the compiler specifies an integer literal as type int, but the programmer wants to specify it as type long, the literal can be suffixed with l or L, and the compiler will know to specify the type of the literal as long.

long x = 123L;

In the above code, the literal 123 has the suffix L, and the compiler will specify it as a long type.

Octal and hexadecimal values can also be given the long type using the suffixes l and L, such as 020L and 0x20L.

long y = 0377L;
long z = 0x7fffL;

To make an integer literal unsigned (unsigned int), use the suffix u or U.

unsigned int x = 123U;

L and U can be combined to denote the unsigned long type. Neither the case nor the order of L and U matters.

unsigned long x = 123LU;

For floating-point numbers, the compiler uses the double type by default. To specify another type, add the suffix f (float) or l (long double) after the number.

The following literal suffixes are commonly used.

  • f and F: float type.
  • l and L: long int type for integers and long double type for floating-point numbers.
  • ll and LL: long long type, such as 3LL.
  • u and U: unsigned int type, such as 15U, 0377U.

Below are some examples.

int x = 1234;
long int x = 1234L;
long long int x = 1234LL;
unsigned int x = 1234U;
unsigned long int x = 1234UL;
unsigned long long int x = 1234ULL;
float x = 3.14f;
double x = 3.14;
long double x = 3.14L;

Overflow

Each data type has a range of values. An overflow occurs when a result falls outside this range (less than the minimum or greater than the maximum) and would need more binary bits than the type provides. A value greater than the maximum is called an overflow; a value less than the minimum is called an underflow.

Generally, the compiler does not report an error for overflow and the code runs normally. For unsigned types, the extra high bits are discarded and only the remaining bits are kept, which often gives unexpected results; for signed types, the C standard leaves the behavior undefined. Therefore, overflow should be avoided.

unsigned char x = 255;
x = x + 1;
printf("%d\n", x); // output: 0

In the above example, 1 is added to the variable x. The result is not 256 but 0, because x is an unsigned char with a maximum value of 255 (binary 11111111). Adding 1 gives 256 (binary 100000000), which needs 9 bits; the leading 1 is discarded, leaving the value 0.

See the following example again:

unsigned int ui = UINT_MAX;  // 4,294,967,295
ui++;
printf("ui = %u\n", ui); // 0
ui--;
printf("ui = %u\n", ui); // 4,294,967,295

In the above example, the constant UINT_MAX is the maximum value of the unsigned int type. If you add 1, it will overflow for that type, thus getting 0. And 0 is the minimum value for that type, and then subtract 1 to get UINT_MAX again.

Overflows are easy to overlook and the compiler does not report them, so you have to be very careful.

for (unsigned int i = n; i >= 0; --i) // error

The above code seems fine, but the loop variable i has type unsigned int, whose minimum value is 0, so i can never be less than 0. When i equals 0 and 1 is subtracted from it, the result is not -1 but the maximum value of unsigned int. The condition i >= 0 is therefore always true, resulting in an infinite loop.

To avoid overflow, the best way is to compare the operands against the limit value of the type before performing the operation.

unsigned int a;
unsigned int b;

// error
if (a + b > UINT_MAX) too_big();
else b = a + b;

// correct
if (a > UINT_MAX - b) too_big();
else b = b + a;

In the above example, the variables a and b are both unsigned int, and their sum is still unsigned int, so the sum can overflow. The first check never fires, because a wrapped sum can never be greater than UINT_MAX. The correct way is to compare a with UINT_MAX - b, which cannot overflow.

Here is another wrong way to write it.

unsigned int i = 5;
unsigned int j = 7;

if (i - j < 0) // error
  printf("negative\n");
else
  printf("positive\n");

The above example prints "positive", because both i and j are of type unsigned int, so the result of i - j is also of that type, whose minimum value is 0. The result can never be less than 0; here it wraps around to a very large positive number.

sizeof operator

sizeof is an operator provided by the C programming language that returns the number of bytes occupied by a certain data type or a value. Its argument can be a keyword of a data type, a variable name or a specific value.

// The argument is a data type
int x = sizeof(int);

// The argument is a variable
int i;
sizeof(i);

// The argument is a value
sizeof(3.14);

The first example above returns the number of bytes occupied by the int type (usually 4).

The second example returns the number of bytes occupied by an integer variable, and the result is exactly the same as the previous example.

The third example returns the number of bytes occupied by the floating-point number 3.14. Since floating-point literals without a suffix have type double, it returns the size of double, which is usually 8 bytes.

C specifies only that the result of sizeof is an unsigned integer; the exact type is left to the system. It may be unsigned int, unsigned long, or even unsigned long long on different systems, with the corresponding printf() placeholders %u, %lu, and %llu. This is inconvenient for portability.

C solves this with a type alias, size_t, that uniformly represents the result type of sizeof. The alias is defined in the stddef.h header file (stdio.h defines it as well) and corresponds to whatever type sizeof returns on the current system, for example unsigned int or unsigned long.

C also provides a constant SIZE_MAX (defined in stdint.h), which indicates the maximum integer that size_t can represent. Therefore, the range of integers that size_t can represent is [0, SIZE_MAX].

printf() has a special placeholder, %zu, for values of type size_t. (The related placeholder %zd is meant for the signed counterpart of size_t.)

printf("%zu\n", sizeof(int));

In the above code, the %zu placeholder prints the value correctly regardless of the underlying type of size_t.

If the current system does not support %zu, you can convert the value to unsigned long and print it with %lu as an alternative.

Automatic type conversion

In some cases, C will automatically convert the type of a value.

Assignment Operation

The assignment operator automatically converts the value on the right to the type of the variable on the left.

Assigning floating-point numbers to integer variables

When floating point numbers are assigned to integer variables, C discards the fractional part directly, rather than rounding.

int x = 3.14;

In the above example, the variable x is an integer type and the value assigned to it is a floating-point number. The compiler first automatically converts 3.14 to int, discarding the fractional part, and then assigns that value to x, so the value of x is 3.

This automatic conversion may lose data (3.14 loses its fractional part), so it is better not to assign values across types; try to ensure that the variable and the value have the same type.

Assigning integers to floating-point variables

Integers are automatically converted to floating-point numbers when assigned to floating point variables.

float y = 12 * 2;

In the above example, the value of the variable y is not 24, but 24.0, because the integer to the right of the equal sign is automatically converted to a floating-point number.

Wide and narrow type conversions

When a narrow byte-width integer type is assigned to a wide byte-width integer variable, the narrow type is automatically converted to a wide type.

For example, a char or short type assigned to an int type is automatically converted to int.

char x = 10;
int i = x; // x is automatically converted to int

When a value of a wider type is assigned to a variable of a narrower type, a narrowing conversion occurs and the value is automatically converted to the narrower type. This may result in truncation, where the extra binary bits are cut off, leading to unexpected results.

int i = 321;
char ch = i; // the value of ch is 65 (321 - 256)

In the above example, the variable ch is a char type with a width of 8 binary bits, and the variable i is an int type that is assigned to ch. ch can only hold the last 8 bits of i (101000001 in binary form, 9 bits in total), so the extra leading bit is discarded, leaving the last 8 bits 01000001 (65 in decimal, equivalent to the character A).

Mixed Type Arithmetic

When values of different types are mixed together for calculation, they must be converted to the same type before calculation. The conversion rules are as follows.

When mixing integer and floating point operations, integers are converted to floating point types.

3 + 1.2 // 4.2

The above example is a mix of int and double types (the literal 1.2 is a double). 3 is converted to the double value 3.0, and the result is 4.2.

  • When different floating-point types are mixed, the type with narrower width is converted to the type with wider width, such as float to double and double to long double.
  • When different integer types are mixed, the type with a narrow width is converted to the type with a wider width. For example, short to int, int to long, etc.

Function Return Type

The parameters and return values of the function are automatically converted to the types specified in the function definition.

int testfunc(int, unsigned char);
char a = 10;
unsigned short b = 20;
long long int c = testfunc(a, b);

In the above example, the arguments a and b are converted to the parameter types declared by the function testfunc(), regardless of their original types, and the int return value is converted to long long int when it is assigned to c.

The following is an example of automatic type conversion of a function return value.

char testfunc(void) {
  int a = 65;
  return a;
}

In the above example, the variable a inside the function is an int type, but the returned value is a char type because that is the type returned in the function definition.

Explicit Type Conversion

We should avoid automatic type conversions to prevent unexpected results, but C provides explicit type conversions that allow manual type conversions.

A value or variable can be converted to the specified type by writing the type in parentheses in front of the value or variable, which is called "casting".

(unsigned char) ch

The above example converts the variable ch to an unsigned character type.

Portable types

The integer types in C (short, int, long) may occupy different byte widths on different computers, and it is not possible to know exactly how many bytes they occupy in advance.

For better portability of C programs, the header file stdint.h creates some new type aliases.

Exact-width integer type, which guarantees that the width of an integer type is determined.

  • int8_t: 8-bit signed integer.
  • int16_t: 16-bit signed integer.
  • int32_t: 32-bit signed integer.
  • int64_t: 64-bit signed integer.
  • uint8_t: 8-bit unsigned integer.
  • uint16_t: 16-bit unsigned integer.
  • uint32_t: 32-bit unsigned integer.
  • uint64_t: 64-bit unsigned integer.

All of the above are type aliases, and the compiler will specify the underlying type they point to. For example, on a given system, if the int type is 32-bit, int32_t will point to int; if the long type is 32-bit, int32_t will point to long.

Here is an example of usage.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
  int32_t x32 = 45933945;
  // PRId32 expands to the correct printf() conversion for int32_t
  printf("x32 = %" PRId32 "\n", x32);
  return 0;
}

In the above example, the variable x32 is declared as type int32_t, which is guaranteed to be 32 bits wide.

Minimum width type, which guarantees the minimum length of an integer type.

  • int_least8_t
  • int_least16_t
  • int_least32_t
  • int_least64_t
  • uint_least8_t
  • uint_least16_t
  • uint_least32_t
  • uint_least64_t

The types above are guaranteed to be at least the specified number of bits wide. For example, int_least8_t is the smallest signed integer type that is at least 8 bits wide.

Fast minimum width type, the type that enables the fastest integer calculation.

  • int_fast8_t
  • int_fast16_t
  • int_fast32_t
  • int_fast64_t
  • uint_fast8_t
  • uint_fast16_t
  • uint_fast32_t
  • uint_fast64_t

The types above guarantee at least the specified width while aiming for the fastest arithmetic. For example, int_fast8_t is the fastest signed integer type that is at least 8 bits wide.

The integer type that can hold a pointer.

  • intptr_t: Signed integer type that can store pointers (memory addresses).
  • uintptr_t: unsigned integer type that can store a pointer.

Maximum width integer type for storing the largest integer.

  • intmax_t: a signed integer type that can store any valid signed integer value.
  • uintmax_t: an unsigned integer type that can store any valid unsigned integer value.

These two types are at least as wide as long long and unsigned long long, respectively.
