The compiler in the C programming language must know the type of the data in order to operate on it. Once the data type of a value is known, it is possible to know the characteristics of that value and how to manipulate it.
There are three basic data types: character (char), integer (int), and floating point (float). Complex types are built on top of them.
Character types
A character type is a single character, and the type declaration uses the char keyword.
char c = 'B';
The above example declares the variable c as a character type and assigns it the letter B.
The C language specifies that character constants must be placed inside single quotes.
Character types are stored in a single byte (8 bits) and are treated as integers by the C programming language, so a character type is an integer with a width of one byte. Each character corresponds to an integer (its ASCII code); for example, B corresponds to the integer 66.
The default range of character types varies from computer to computer. Some systems default to -128 to 127, while others default to 0 to 255. Both ranges include the ASCII character range of 0 to 127.
Integers and characters are interchangeable, and an integer can be assigned to a variable of the character type as long as it is in the range of the character type.
char c = 66;
// equal to
char c = 'B';
In the above example, the variable c is a character type and the value assigned to it is the integer 66, which has the same effect as the value assigned to the character B.
Two variables of character type can perform mathematical operations.
char a = 'B'; // equal to char a = 66;
char b = 'C'; // equal to char b = 67;
printf("%d\n", a + b); // output 133
In the above example, the character variables a and b are added together as if they were two integers. The placeholder %d prints a decimal integer, so the output is 133.
The single quote itself is also a character, and in order to represent this character constant, it must be escaped using a backslash.
char t = '\'';
In the above example, the variable t is a single-quoted character, and since character constants must be placed inside single quotes, the internal single quotes are escaped with a backslash.
This escaped writing style is mainly used to represent some non-printable control characters defined in ASCII codes that are also character type values.
\a: alarm, which causes the terminal to sound an alarm or appear to blink, or both at the same time.
\b: backspace, the cursor goes back one character, but does not delete the character.
\f: form feed, the cursor moves to the next page.
\n: newline character.
\r: carriage return character.
\t: tab character, the cursor moves to the next horizontal tab position, usually the next multiple of 8.
\v: vertical tab, the cursor moves to the next vertical tab position, usually the same column of the next line.
\0: null character, representing no content. Its integer value is 0; note that it is not the same as the digit character '0', whose ASCII code is 48.
char x = 'B';
char x = 66;
char x = '\102'; // octal
char x = '\x42'; // hexadecimal
All four of the above examples are written in equivalent ways.
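As a quick check of the equivalence described above, the following sketch compares the four spellings and prints a character next to its integer code. It assumes an ASCII-based system, and the helper names same_character and show_character are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 when all four spellings of 'B'
   produce the same value (true on ASCII-based systems). */
int same_character(void) {
    char a = 'B';      /* character constant */
    char b = 66;       /* decimal integer */
    char c = '\102';   /* octal escape */
    char d = '\x42';   /* hexadecimal escape */
    return a == b && b == c && c == d;
}

/* Prints a character both as a character and as its integer code. */
void show_character(char ch) {
    printf("%c = %d\n", ch, ch);
}
```

On an ASCII system, calling show_character('B') should print the letter B followed by its code 66.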
Integer Types
The integer type is used to represent larger integers, and the type declaration uses the int keyword.
int a;
The above example declares an integer variable a.
The size of the int type varies from computer to computer. It is most common to use 4 bytes (32 bits) to store a value of type int, but 2 bytes (16 bits) or 8 bytes (64 bits) can also be used. The range of integers they can represent is as follows.
16-bit: -32,768 to 32,767.
32-bit: -2,147,483,648 to 2,147,483,647.
64-bit: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
signed, unsigned
The C programming language uses the signed keyword to indicate that a type carries a sign and can hold negative values, and the unsigned keyword to indicate that the type has no sign and can only represent zero and positive integers.
For the int type, the default is signed, that is, int is equivalent to signed int. Since this is the default, the keyword signed is usually omitted, but writing it is not an error.
signed int a;
// equal to
int a;
The int type can also be used without a positive or negative sign, to represent only non-negative integers. In this case, the variable must be declared with the keyword unsigned.
unsigned int a;
The advantage of declaring an integer variable as unsigned is that the maximum value representable in the same amount of memory roughly doubles. For example, the maximum value of a 16-bit signed int is 32767, while the maximum value of a 16-bit unsigned int is 65535.
The int keyword in unsigned int can be omitted, so the above variable declaration can also be written as follows.
unsigned a;
The character type char can also be declared signed or unsigned.
signed char c; // range -128 to 127
unsigned char c; // range 0 to 255
Note: C does not specify whether the char type is signed by default; this is determined by the current system. This means that char is not equivalent to signed char; it can behave as either signed char or unsigned char, unlike int, which is always equivalent to signed int.
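Whether plain char is signed on a given system can be checked at run time. The sketch below relies on the CHAR_MIN constant from the standard header limits.h; the function name plain_char_is_signed is invented for this example.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if plain char is signed on this system,
   0 if it is unsigned. CHAR_MIN is 0 exactly when char is unsigned. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}
```

Code that depends on a particular signedness is usually better off declaring signed char or unsigned char explicitly rather than testing for it.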
Subtypes of integers
If the int type uses 4 or 8 bytes to represent an integer, this is a waste of space for small integers. On the other hand, in some cases, larger integers are needed, and 8 bytes are not enough. To solve these problems, C provides three integer subtypes in addition to the int type. This facilitates finer scoping of integer variables and better expresses the intent of the code.
short int (abbreviated as short): generally occupies 2 bytes (the integer range is -32768 to 32767).
long int (abbreviated as long): occupies no less space than int, at least 4 bytes.
long long int (abbreviated as long long): occupies no less space than long, at least 8 bytes.
short int a;
long int b;
long long int c;
The above code declares three variables of integer subtypes.
By default, short, long, and long long are signed, i.e., the signed keyword is omitted. They can also be declared unsigned, which roughly doubles the maximum value that can be represented.
unsigned short int a;
unsigned long int b;
unsigned long long int c;
C allows the int keyword to be omitted, so the variable declaration statements can also be written as follows.
short a;
unsigned short a;
long b;
unsigned long b;
long long c;
unsigned long long c;
The byte lengths of data types differ between computers. When you really need a 32-bit integer, you should use the long type instead of the int type, because long guarantees no less than 4 bytes; when you really need a 64-bit integer, you should use the long long type, which guarantees no less than 8 bytes. On the other hand, to save space, you should use the short type when only a 16-bit integer is needed, and the char type when an 8-bit integer is needed.
Limit values for integer types
Sometimes we need to check the maximum and minimum values of different integer types in the current system. The C header file limits.h provides the corresponding constants, such as SCHAR_MIN for the minimum value of the signed char type (-128) and SCHAR_MAX for its maximum value (127).
For the sake of code portability, you should try to use these constants when you need to know the limit value of some integer type.
SCHAR_MIN, SCHAR_MAX: the minimum and maximum values of signed char.
SHRT_MIN, SHRT_MAX: the minimum and maximum values of short.
INT_MIN, INT_MAX: the minimum and maximum values of int.
LONG_MIN, LONG_MAX: the minimum and maximum values of long.
LLONG_MIN, LLONG_MAX: the minimum and maximum values of long long.
UCHAR_MAX: the maximum value of unsigned char.
USHRT_MAX: the maximum value of unsigned short.
UINT_MAX: the maximum value of unsigned int.
ULONG_MAX: The maximum value of unsigned long.
ULLONG_MAX: the maximum value of unsigned long long.
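A minimal sketch of how these constants might be used: one helper checks whether a value fits into a short, and another prints a few limits of the current system. Both function names are made up for illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if v can be stored in a short without loss. */
int fits_in_short(long v) {
    return v >= SHRT_MIN && v <= SHRT_MAX;
}

/* Prints a few of the limits for the current system. */
void print_some_limits(void) {
    printf("short: %d to %d\n", SHRT_MIN, SHRT_MAX);
    printf("int:   %d to %d\n", INT_MIN, INT_MAX);
    printf("long:  %ld to %ld\n", LONG_MIN, LONG_MAX);
    printf("unsigned int max: %u\n", UINT_MAX);
}
```

Because the comparison uses the constants rather than hard-coded numbers such as 32767, the check stays correct on systems where short has a different width.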
Integers in Different Bases
The integers in C are decimal numbers by default. If you want to represent octal and hexadecimal numbers, you must use a specialized representation.
Octal uses 0 as a prefix, such as 017, 0377.
int a = 012; // octal, equivalent to 10 in decimal
Hexadecimal uses 0x or 0X as a prefix, such as 0xf, 0X10.
int a = 0x1A2B; // hexadecimal, equivalent to 6699 in decimal
Some compilers accept the 0b prefix for binary numbers, but it was not part of the C standard before C23.
int x = 0b101010;
Note: The different bases are just ways of writing the integer and have no effect on how the integer is actually stored. All integers are stored in binary form, independent of the way they are written. Different bases can be mixed; for example, 10 + 015 + 0x20 is a legal expression.
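To make the mixed expression concrete, here is a small sketch that evaluates it. The function name mixed_base_sum is hypothetical.

```c
/* The three literals are written in decimal, octal and hexadecimal,
   but all of them are ordinary int values: 10 + 13 + 32. */
int mixed_base_sum(void) {
    return 10 + 015 + 0x20;
}
```

The function returns 55, since 015 is 13 and 0x20 is 32 in decimal.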
The printf() placeholders for integers in different bases are as follows:
%d: decimal integer.
%o: octal integer.
%x: hexadecimal integer.
%#o: displays octal integers prefixed with 0.
%#x: displays a hexadecimal integer prefixed with 0x.
%#X: displays the hexadecimal integer prefixed with 0X.
int x = 100;
printf("dec = %d\n", x); // 100
printf("octal = %o\n", x); // 144
printf("hex = %x\n", x); // 64
printf("octal = %#o\n", x); // 0144
printf("hex = %#x\n", x); // 0x64
printf("hex = %#X\n", x); // 0X64
Floating-point type
Any value with a decimal point will be interpreted by the compiler as a floating point number.
The type declaration for floating point numbers uses the float keyword, which can be used to declare floating-point variables.
float c = 10.5;
In the above example, the variable c is a floating-point type.
The float type takes up 4 bytes (32 bits), 8 of which hold the exponent and its sign, while the remaining 24 bits hold the fractional part (significand) and the sign of the number. The float type provides at least 6 significant decimal digits, and its decimal exponent ranges from at least -37 to 37.
Sometimes the precision or range of values provided by 32-bit floating-point numbers is not enough, and C provides two other larger floating-point types.
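double: usually occupies 8 bytes (64 bits). The standard guarantees at least 10 significant decimal digits, and the common 64-bit IEEE 754 format provides 15.
long double: usually occupies 16 bytes, although the size varies between systems.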
Note: due to precision limitations, a floating point number is only an approximation and its calculation is not exact, for example, 0.1+0.2 in C is not equal to 0.3, but has a small error.
if(0.1 + 0.2 == 0.3) // false
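Because of this, floating-point values are usually compared with a tolerance instead of ==. The following is only a sketch: the name nearly_equal and the choice of tolerance are assumptions, and a suitable tolerance depends on the magnitude of the values involved.

```c
/* Hypothetical helper: returns 1 if a and b differ by at most eps. */
int nearly_equal(double a, double b, double eps) {
    double diff = a - b;
    if (diff < 0) {
        diff = -diff;  /* absolute value without needing math.h */
    }
    return diff <= eps;
}
```

With this helper, nearly_equal(0.1 + 0.2, 0.3, 1e-9) reports the two values as equal even though == does not.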
C allows the use of scientific notation for floating-point numbers, using the letter e to distinguish between the fractional part and the exponential part.
double x = 123.456e+3; // 123.456 x 10^3
// equal to
double x = 123.456e3;
Boolean type
C originally did not have a separate type for Boolean values, but instead used the integer 0 for false and all non-zero values for true.
int x = 1;
if (x) {
  printf("x is true!\n");
}
In the above example, the variable x is equal to 1. C treats this value as true and therefore executes the code inside the body of the if statement.
The C99 standard adds the _Bool type, which represents a boolean value. However, this type is really just an unsigned integer type that holds 0 for false and 1 for true, as shown in the example below.
_Bool isNormal;
isNormal = 1;
if(isNormal)
printf("Everything is OK.\n");
The header file stdbool.h defines another type alias, bool, and defines true as 1 and false as 0. These names can be used as long as this header file is included.
#include <stdbool.h>
bool flag = false;
In the above example, after including the header file stdbool.h, you can use bool to define a boolean variable.
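One detail worth seeing in code is that converting any non-zero value to bool stores 1, not the original number. The sketch below illustrates this; the function names int_to_bool and is_even are invented for the example.

```c
#include <stdbool.h>

/* Any non-zero value converted to bool becomes 1; zero stays 0. */
int int_to_bool(int value) {
    bool b = value;
    return b;
}

/* A function can use bool as its return type like any other type. */
bool is_even(int n) {
    return n % 2 == 0;
}
```

For instance, int_to_bool(42) returns 1, while int_to_bool(0) returns 0.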
Literal types
A literal is a value that appears directly inside the code.
int x = 123;
In the above code, x is the variable and 123 is the literal.
Literals are also written to memory at compile time, so the compiler must assign a data type to each literal, just as a data type must be specified for each variable.
Normally, decimal integer literals (e.g. 123) are given type int by the compiler. If a value is larger than what int can represent, the compiler gives it type long int. If the value exceeds long int as well, it is given type long long int (C90 compilers, which have no long long, use unsigned long instead).
Fractional numbers (e.g. 3.14) are given type double.
Literal suffixes
Sometimes a programmer wants to specify a different type for a literal. For example, if the compiler specifies an integer literal as type int, but the programmer wants to specify it as type long, the literal can be suffixed with l or L, and the compiler will know to specify the type of the literal as long.
int x = 123L;
In the above code, the literal 123 has the suffix L, and the compiler will specify it as a long type.
Octal and hexadecimal values can also be given type long using the suffixes l and L, such as 020L and 0x20L.
int y = 0377L;
int z = 0x7fffL;
If you wish to specify an unsigned integer (unsigned int), you can use the suffix u or U.
int x = 123U;
L and U can be used in combination to represent the unsigned long type. The case and order of L and U do not matter.
int x = 123LU;
For floating point numbers, the compiler specifies the double type by default. If you wish to specify another type, you need to add the suffix f (float) or l (long double) after the number.
The following literal suffixes are commonly used.
f and F: float type.
l and L: long int type for integers and long double type for floating-point numbers.
ll and LL: long long type, such as 3LL.
u and U: unsigned int type, such as 15U, 0377U.
Below are some examples.
int x = 1234;
long int x = 1234L;
long long int x = 1234LL;
unsigned int x = 1234U;
unsigned long int x = 1234UL;
unsigned long long int x = 1234ULL;
float x = 3.14f;
double x = 3.14;
long double x = 3.14L;
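One way to observe the type the compiler gives a literal is the _Generic selection, which is assumed to be available here (it requires C11 or later). The macro TYPE_NAME and the function print_literal_types are hypothetical names used only for this sketch.

```c
#include <stdio.h>

/* TYPE_NAME is a hypothetical macro; _Generic requires C11 or later.
   It maps the type of its argument to a readable name. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    long: "long", \
    long long: "long long", \
    unsigned int: "unsigned int", \
    unsigned long: "unsigned long", \
    unsigned long long: "unsigned long long", \
    float: "float", \
    double: "double", \
    long double: "long double", \
    default: "other")

/* Prints the type selected for several suffixed literals. */
void print_literal_types(void) {
    printf("1234    -> %s\n", TYPE_NAME(1234));
    printf("1234L   -> %s\n", TYPE_NAME(1234L));
    printf("1234LL  -> %s\n", TYPE_NAME(1234LL));
    printf("1234U   -> %s\n", TYPE_NAME(1234U));
    printf("1234UL  -> %s\n", TYPE_NAME(1234UL));
    printf("3.14f   -> %s\n", TYPE_NAME(3.14f));
    printf("3.14    -> %s\n", TYPE_NAME(3.14));
    printf("3.14L   -> %s\n", TYPE_NAME(3.14L));
}
```

The suffix alone decides the reported type here, because each of these values is small enough to fit in the type its suffix names.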
Overflow
Each data type has a range of values. If a value falls outside this range (less than the minimum or greater than the maximum), it needs more binary bits than the type provides, and an overflow occurs. A value greater than the maximum is called an overflow; a value less than the minimum is called an underflow.
Generally, the compiler will not report an error for overflow and will execute the code normally, but the extra binary bits are ignored and only the remaining bits are kept, which often gives unexpected results. For unsigned types this wrap-around is defined by the language; for signed types the result of overflow is undefined. Therefore, overflow should be avoided.
unsigned char x = 255;
x = x + 1;
printf("%d\n", x); // output: 0
In the above example, 1 is added to the variable x. The result is not 256, but 0, because x is an unsigned char type with a maximum value of 255 (binary 11111111). After adding 1, an overflow occurs and the highest bit of 256 (the leading 1 in binary 100000000) is discarded, leaving the value 0.
See the following example again:
unsigned int ui = UINT_MAX; // 4,294,967,295
ui++;
printf("ui = %u\n", ui); // 0
ui--;
printf("ui = %u\n", ui); // 4,294,967,295
In the above example, the constant UINT_MAX is the maximum value of the unsigned int type. Adding 1 overflows that type and yields 0. Since 0 is the minimum value for that type, subtracting 1 yields UINT_MAX again.
Overflows are easy to ignore and the compiler doesn’t report errors, so you have to be very careful.
for (unsigned int i = n; i >= 0; --i) // error
The above code seems to be fine, but the type of the loop variable i is unsigned int, and the minimum value of this type is 0. It is impossible to get a result less than 0. When i is equal to 0 and 1 is subtracted from it, the result is not -1 but the maximum value of type unsigned int, so i is always greater than or equal to 0, resulting in an infinite loop.
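One possible way to count down to zero with an unsigned variable is to test for zero before decrementing, as in the sketch below. The function name count_down_visits is made up, and it simply counts how many times the loop body runs.

```c
/* Hypothetical helper: runs the loop body for i = n, n-1, ..., 0
   and returns how many times the body was executed. */
unsigned int count_down_visits(unsigned int n) {
    unsigned int visits = 0;
    unsigned int i = n;
    for (;;) {
        ++visits;          /* the loop body: here it only counts the visit */
        if (i == 0) {
            break;         /* stop before i would wrap around */
        }
        --i;
    }
    return visits;
}
```

Because the exit test happens before the decrement, i never wraps around to the maximum value of unsigned int.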
To avoid overflow, the best way is to compare the result of the operation with the limit value of the type.
unsigned int a;
unsigned int b;
// error
if (a + b > UINT_MAX) too_big();
else b = a + b;
// correct
if (a > UINT_MAX - b) too_big();
else b = b + a;
In the above example, the variables a and b are both unsigned int, and their sum is still unsigned int, so it can overflow and never compares greater than UINT_MAX. The correct way is to compare a with UINT_MAX - b.
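The correct comparison can be wrapped in a small function. This is a sketch only: the name add_checked and its return convention (1 for success, 0 for overflow) are assumptions made for the example.

```c
#include <limits.h>

/* Hypothetical helper: stores a + b in *sum and returns 1 on success;
   returns 0 and leaves *sum untouched if the sum would overflow. */
int add_checked(unsigned int a, unsigned int b, unsigned int *sum) {
    if (a > UINT_MAX - b) {
        return 0;          /* a + b would exceed UINT_MAX */
    }
    *sum = a + b;
    return 1;
}
```

The subtraction UINT_MAX - b can never overflow, which is why the check is performed in this form.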
Here is another wrong way to write it.
unsigned int i = 5;
unsigned int j = 7;
if (i - j < 0) // error
  printf("negative\n");
else
  printf("positive\n");
The above example outputs “positive”, because both variables i and j are of type unsigned int, so the result of i - j is also of this type, whose minimum value is 0. It is impossible to get a result less than 0.
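A safer approach is to compare the two values directly instead of testing the sign of their difference. The helpers below are a minimal sketch with invented names.

```c
/* Returns 1 if i is smaller than j; compares directly instead of
   testing the sign of i - j. */
int is_smaller(unsigned int i, unsigned int j) {
    return i < j;
}

/* Hypothetical helper: absolute difference of two unsigned values,
   computed without ever subtracting a larger value from a smaller one. */
unsigned int abs_diff(unsigned int i, unsigned int j) {
    return (i < j) ? (j - i) : (i - j);
}
```

With i equal to 5 and j equal to 7, is_smaller returns 1 and abs_diff returns 2.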
sizeof operator
sizeof is an operator provided by the C programming language that returns the number of bytes occupied by a certain data type or a value. Its operand can be the name of a data type, a variable name or a specific value.
// The argument is a data type
int x = sizeof(int);
// The argument is a variable
int i;
sizeof(i);
// parameter is a numeric value
sizeof(3.14);
The first example above returns the number of bytes occupied by the int type (usually 4 or 8).
The second example returns the number of bytes occupied by an integer variable, and the result is exactly the same as the previous example.
The third example returns the number of bytes occupied by the floating-point number 3.14. Since unsuffixed floating point literals are always stored as the double type, it will return 8 on systems where the double type occupies 8 bytes.
C only specifies that the return value of the sizeof operator is an unsigned integer; it does not specify a concrete type, but leaves it up to the system to decide what type sizeof actually returns. The return value may be unsigned int, unsigned long, or even unsigned long long on different systems, and the corresponding printf() placeholders are %u, %lu, and %llu. This is inconvenient for program portability.
C provides a solution by creating a type alias, size_t, to uniformly represent the return value type of sizeof. This alias is defined in the stddef.h header file (stdio.h defines it as well) and corresponds to the current system's return value type of sizeof, which may be unsigned int, unsigned long, or unsigned long long.
C also provides a constant SIZE_MAX, which indicates the maximum integer that size_t can represent. Therefore, the range of integers that size_t can represent is [0, SIZE_MAX].
printf() has a special placeholder, %zu, to handle values of type size_t (%zd is meant for the corresponding signed type).
printf("%zu\n", sizeof(int));
In the above code, the %zu placeholder prints the value correctly regardless of the underlying type of the sizeof return value.
If the current system does not support %zu, you can cast the value and use %u (unsigned int) or %lu (unsigned long int) as an alternative.
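Putting sizeof and size_t together, the sketch below prints the sizes of several types on the current system. The function names are hypothetical, and the printed numbers depend on the platform.

```c
#include <stddef.h>
#include <stdio.h>

/* The literal 3.14 has type double, so this equals sizeof(double). */
size_t size_of_literal(void) {
    return sizeof(3.14);
}

/* Prints the size in bytes of a few common types. */
void print_sizes(void) {
    printf("char:      %zu\n", sizeof(char));
    printf("int:       %zu\n", sizeof(int));
    printf("long:      %zu\n", sizeof(long));
    printf("long long: %zu\n", sizeof(long long));
    printf("double:    %zu\n", sizeof(double));
}
```

Only sizeof(char) is fixed by the language (it is always 1); the other values vary between systems.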
Automatic type conversion
In some cases, C will automatically convert the type of a value.
Assignment Operation
The assignment operator automatically converts the value on the right to the type of the variable on the left.
Assigning floating-point numbers to integer variables
When floating point numbers are assigned to integer variables, C discards the fractional part directly, rather than rounding.
int x = 3.14;
In the above example, the variable x is an integer type and the value assigned to it is a floating-point number. The compiler first automatically converts 3.14 to int, discarding the fractional part, and then assigns that value to x, so the value of x is 3.
This automatic conversion may result in the loss of some data (3.14 loses its fractional part), so it is better not to assign values across types and to keep the type of the variable and the type of the value the same.
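The difference between discarding the fractional part and rounding can be sketched as follows. Both function names are invented, and round_to_int is just one simple rounding scheme that assumes the result fits in an int.

```c
/* Assigning a double to an int drops the fractional part
   (the value is truncated toward zero). */
int truncate_to_int(double x) {
    int n = x;
    return n;
}

/* Hypothetical helper: rounds to the nearest integer, with halves
   rounded away from zero. Assumes the result fits in an int. */
int round_to_int(double x) {
    int n = (x >= 0) ? x + 0.5 : x - 0.5;
    return n;
}
```

For example, truncate_to_int(3.99) gives 3, whereas round_to_int(3.5) gives 4.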
Assigning integers to floating-point variables
Integers are automatically converted to floating-point numbers when assigned to floating point variables.
float y = 12 * 2;
In the above example, the value of the variable y is not 24, but 24.0, because the integer to the right of the equal sign is automatically converted to a floating-point number.
Wide and Narrow typecast in C
When a narrow byte-width integer type is assigned to a wide byte-width integer variable, the narrow type is automatically converted to a wide type.
For example, a char or short value assigned to an int variable is automatically converted to int.
char x = 10;
int i = x;
When a type with a wider byte width is assigned to a variable with a narrower byte width, a type degradation occurs and the type is automatically converted to a type with a narrower byte width. This may result in truncation, where the system automatically truncates the extra binary bits, leading to unpredictable results.
int i = 321;
char ch = i; // the value of ch is 65 (321 - 256)
In the above example, the variable ch is a char type with a width of 8 binary bits. The variable i is an int type, and i is assigned to ch. ch can only hold the last 8 bits of i (101000001 in binary form, 9 bits in total), so the extra binary bit in front is discarded, keeping the last 8 bits as 01000001 (65 in decimal, equivalent to the character A).
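The truncation can be reproduced with a small function. The sketch uses unsigned char, for which the conversion is defined as reduction modulo 256 (assuming 8-bit bytes); for plain or signed char, converting an out-of-range value is implementation-defined. The name low_byte is made up.

```c
/* Hypothetical helper: keeps only the lowest 8 bits of v.
   Conversion to unsigned char reduces the value modulo 256
   (assuming 8-bit bytes). */
unsigned char low_byte(int v) {
    unsigned char ch = v;
    return ch;
}
```

Here low_byte(321) yields 65, matching the example above.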
Mixed Type Arithmetic
When values of different types are mixed together for calculation, they must be converted to the same type before calculation. The conversion rules are as follows.
When mixing integer and floating point operations, integers are converted to floating point types.
3 + 1.2 // 4.2
The above example mixes an int value with a floating-point value (the literal 1.2 has type double). 3 is converted to the floating-point value 3.0 and the calculation then yields 4.2.
When different floating-point types are mixed, the type with narrower width is converted to the type with wider width, such as float to double and double to long double.
When different integer types are mixed, the type with a narrow width is converted to the type with a wider width. For example, short to int, int to long, etc.
Function Return Type
The parameters and return values of the function are automatically converted to the types specified in the function definition.
int testfunc(int, unsigned char);
char a = 10;
unsigned short b = 20;
long long int c = testfunc(a, b);
In the above example, the arguments a and b are converted to the parameter types declared by the function testfunc(), regardless of their original types.
The following is an example of automatic type conversion of a function return value.
char testfunc(void) {
  int a = 65;
  return a;
}
In the above example, the variable a inside the function is an int type, but the returned value is a char type because that is the type returned in the function definition.
Explicit Type Conversion
We should avoid automatic type conversions to prevent unexpected results, but C provides explicit type conversions that allow manual type conversions.
A value or variable can be converted to the specified type by specifying the type in parentheses in front of the value or variable, which is called “casting”.
(unsigned char) ch
The above example converts the variable ch to an unsigned character type.
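A common use of a cast is forcing floating-point division on integer operands. The following sketch uses invented function names to contrast the two behaviours.

```c
/* The cast converts sum to double first, so the division is carried out
   in floating point and the fractional part is kept. */
double average(int sum, int count) {
    return (double)sum / count;
}

/* Without a cast, both operands are int and the division truncates. */
int integer_quotient(int sum, int count) {
    return sum / count;
}
```

For the inputs 7 and 2, average returns 3.5 while integer_quotient returns 3.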
Portability Type
The integer types in C (short, int, long) may occupy different byte widths on different computers, and it is not possible to know exactly how many bytes they occupy in advance.
For better portability of C programs, the header file stdint.h creates some new type aliases.
Exact-width integer type, which guarantees that the width of an integer type is determined.
int8_t: 8-bit signed integer.
int16_t: 16-bit signed integer.
int32_t: 32-bit signed integer.
int64_t: 64-bit signed integer.
uint8_t: 8-bit unsigned integer.
uint16_t: 16-bit unsigned integer.
uint32_t: 32-bit unsigned integer.
uint64_t: 64-bit unsigned integer.
All of the above are type aliases, and the compiler will specify the underlying type they point to. For example, on a given system, if the int type is 32-bit, int32_t will point to int; if the long type is 32-bit, int32_t will point to long.
Here is an example of usage.
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h> // PRId32: the printf placeholder that matches int32_t
int main(void) {
  int32_t x32 = 45933945;
  printf("x32 = %" PRId32 "\n", x32);
  return 0;
}
In the above example, the variable x32 is declared as type int32_t, which is guaranteed to be 32 bits wide.
Minimum width type, which guarantees the minimum length of an integer type.
int_least8_t
int_least16_t
int_least32_t
int_least64_t
uint_least8_t
uint_least16_t
uint_least32_t
uint_least64_t
These types above are guaranteed to be no narrower than the specified width. For example, int_least8_t is the smallest type that can hold an 8-bit signed integer.
Fast minimum width type, the type that enables the fastest integer calculation.
int_fast8_t
int_fast16_t
int_fast32_t
int_fast64_t
uint_fast8_t
uint_fast16_t
uint_fast32_t
uint_fast64_t
The above types guarantee at least the given width while aiming for the fastest arithmetic speed. For example, int_fast8_t is the fastest type that can hold an 8-bit signed integer.
The integer type that can hold a pointer.
intptr_t: Signed integer type that can store pointers (memory addresses).
uintptr_t: unsigned integer type that can store a pointer.
Maximum width integer type for storing the largest integer.
intmax_t: a type that can store any valid signed integer.
uintmax_t: a type that can store any valid unsigned integer.
The two types above are at least as wide as long long and unsigned long long.
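The guarantees described in this section can be checked on a given system with a short sketch like the one below. The function names are hypothetical; CHAR_BIT comes from limits.h and the PRId64 placeholder macro from inttypes.h.

```c
#include <limits.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* Returns 1 if every exact-width type has exactly the advertised width. */
int exact_widths_ok(void) {
    return sizeof(int8_t) * CHAR_BIT == 8
        && sizeof(int16_t) * CHAR_BIT == 16
        && sizeof(int32_t) * CHAR_BIT == 32
        && sizeof(int64_t) * CHAR_BIT == 64;
}

/* Prints the sizes chosen by this system for some of the other aliases. */
void print_widths(void) {
    printf("int_least16_t: %zu bytes\n", sizeof(int_least16_t));
    printf("int_fast16_t:  %zu bytes\n", sizeof(int_fast16_t));
    printf("intptr_t:      %zu bytes\n", sizeof(intptr_t));
    printf("intmax_t:      %zu bytes\n", sizeof(intmax_t));
    printf("INT64_MAX = %" PRId64 "\n", INT64_MAX);
}
```

The exact-width check is fixed by the definitions of these types, while the sizes printed for the least, fast, pointer and maximum-width types depend on the platform.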