C Tokens and Declaration Rules

A comprehensive guide to understanding C Tokens, Format Specifiers, and avoiding common printf logic errors.


👋 Hey, I'm Shreya, a Senior Software Engineer at a leading MNC. Join me on this blog as I share insights, tips, and experiences. Let's explore these fascinating worlds together! 🚀💻 #CodeAndSecure

In this article, we will break down the fundamental building blocks of C Tokens and explore the fascinating (and sometimes tricky) behaviour of the printf function.


1. What are C-Tokens?

In English, the smallest individual unit is a letter. C is a language too, and the smallest individual unit of C code that means something to the compiler is called a Token.

The compiler breaks your code down into these tokens to understand what you want to do.

The 5 Types of Tokens

C Tokens are commonly grouped into five categories. The counts below are the figures usually quoted for classic C (C89/C90):

  1. Keywords: Reserved words with special meaning (32 in C89/C90; C99 raises this to 37 and C11 to 44).

  2. Operators: Symbols that perform operations (approx. 45).

  3. Separators: Symbols used to separate code blocks or lines (approx. 14).

  4. Constants: Fixed values (like 10, 'a', or the string literal "hello").

  5. Identifiers: Names given to variables and functions (like main, printf).

Token Analysis Example

Let's analyze a simple program to count the tokens.

The Rule: You can have any number of spaces, tabs, or newlines between two tokens without affecting the code logic.

void main()
{
    printf("hello");
}

Token Count Breakdown (11 Tokens Total):

  1. void (Keyword)

  2. main (Identifier)

  3. ( (Operator/Separator - depending on context)

  4. )

  5. { (Separator)

  6. printf (Identifier)

  7. (

  8. "hello" (String Constant)

  9. )

  10. ; (Separator)

  11. } (Separator)

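Note on Separators: Inside an expression, ( ) act as operators (a function call or grouping). In a function declaration (like void main()), they act as separators or delimiters. The C standard itself calls all of these symbols punctuators. The example keeps void main() so the token count stays at 11; the portable form is int main(void), and a complete program would also need #include <stdio.h> before calling printf.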
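The counting above can be reproduced with a few lines of C. This is only a minimal sketch, not how a real compiler tokenizes: the helper name `count_tokens` is made up for illustration, and it assumes a tiny subset of C (names, integer constants, string literals, and one-character symbols).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts tokens in a tiny subset of C.
   Assumption: every symbol is a single character, so multi-character
   operators such as "==" or "++" would be counted incorrectly. */
size_t count_tokens(const char *src) {
    size_t count = 0;
    while (*src != '\0') {
        unsigned char c = (unsigned char)*src;
        if (isspace(c)) {
            src++;                      /* whitespace separates tokens but is not one */
            continue;
        }
        if (isalpha(c) || c == '_') {   /* keyword or identifier */
            while (isalnum((unsigned char)*src) || *src == '_') src++;
        } else if (isdigit(c)) {        /* integer constant */
            while (isdigit((unsigned char)*src)) src++;
        } else if (c == '"') {          /* string literal, up to the closing quote */
            src++;
            while (*src != '\0' && *src != '"') {
                if (*src == '\\' && src[1] != '\0') src++;  /* skip an escaped character */
                src++;
            }
            if (*src == '"') src++;
        } else {
            src++;                      /* any other character: one symbol token */
        }
        count++;
    }
    return count;
}
```

Passing the example program as a string gives 11, and the result stays 11 when all the spaces and newlines between tokens are removed, which is the whitespace rule in action.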

2. The printf Anatomy: Arguments and Specifiers

The printf function is more versatile than most beginners realize. Its behavior relies entirely on how you map Format Specifiers to Arguments.

The Basics

  • Argument 1: Must be a string (the format string).

  • Argument 2, 3, ... n: The values to be substituted into the format string.

  • Separator: Every argument must be separated by a comma.

How Replacement Works

At run time, printf scans the format string from left to right and replaces each format specifier (like %d) with the next argument in order.

Example 1: Basic Mapping

// 3 arguments total
printf("hello %d abcd %d", 10, 20);
  • 1st %d is replaced by 10 (2nd arg).

  • 2nd %d is replaced by 20 (3rd arg).

  • Output: hello 10 abcd 20.

Example 2: Interleaved Text

printf(" %d hai %d xyz", 10, 20);

Output: 10 hai 20 xyz (preceded by one space, because the format string starts with a space).
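To inspect the left-to-right matching without relying on the terminal, the sketch below uses snprintf, the standard relative of printf that writes into a buffer instead of the screen. The helper name `format_pair` and the buffer handling are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: formats two values with the format string from
   Example 1 and stores the text in buf instead of printing it.
   Returns the length of the formatted text (not counting the '\0'). */
int format_pair(char *buf, size_t size, int first, int second) {
    /* The first %d takes `first`, the second %d takes `second`. */
    return snprintf(buf, size, "hello %d abcd %d", first, second);
}
```

With `char buf[32];`, calling `format_pair(buf, sizeof buf, 10, 20)` leaves `hello 10 abcd 20` in `buf` and returns 16.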

3. The "Garbage Value" Trap

What happens when your mapping is incorrect? C is very forgiving regarding syntax but brutal regarding logic. If you provide fewer arguments than format specifiers, you enter undefined behavior territory.

Case A: More Specifiers than Arguments

printf("%d %d %d", 10, 20);
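  • First %d → 10

  • Second %d → 20

  • Third %d → Garbage Value (no third value was passed, so printf reads whatever happens to be in the place where that argument would have been; the behavior is undefined).

  • Output: 10 20 41245 (the last number will vary, and the program is allowed to misbehave in other ways too).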
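One way to reason about this trap is to count the specifiers and compare that number with the values you pass. The sketch below does the counting; `count_specifiers` is a hypothetical helper, and it is deliberately simplified.

```c
#include <stddef.h>

/* Hypothetical helper: counts conversion specifiers in a format string.
   Simplified: every '%' starts one specifier, except "%%", which prints
   a literal percent sign and consumes no argument. Specifiers that use
   '*' for width or precision consume extra arguments and are not handled. */
size_t count_specifiers(const char *format) {
    size_t count = 0;
    for (const char *p = format; *p != '\0'; p++) {
        if (*p != '%') continue;
        if (p[1] == '%') {
            p++;            /* skip the second '%' of "%%" */
        } else if (p[1] != '\0') {
            count++;        /* a real specifier such as %d or %f */
        }
    }
    return count;
}
```

`count_specifiers("%d %d %d")` returns 3, so a call that passes only two values after the format string is one argument short. In practice you rarely need to count by hand: compiling with warnings enabled (for example `-Wall` on GCC or Clang) reports this kind of mismatch when the format string is a literal.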

Case B: Expressions in printf

You can perform math directly inside the argument list.

printf("%d", 3 + 2);
  • Output: 5.

However, look at this tricky scenario:

printf("%d + %d", 3 + 2);
  • The first %d takes the result of 3 + 2 (which is 5).

  • The second %d has no matching argument.

Output: 5 + [Garbage Value].

One of the most common mistakes beginners make is assuming the text inside the quotes affects the calculation outside the quotes. Let's look at this tricky example:

printf("%d * %d = %d", 5, 2, 5 + 2);

What you might expect: 5 * 2 = 10

What actually happens: 5 * 2 = 7

Why? Let's break it down:

  1. The String (Visual): The format string is "%d * %d = %d". The * symbol here is just a character, like a letter or a comma. It has no mathematical power. It simply prints an asterisk to the screen.

  2. The Arguments (Logical):

    • The 1st %d is replaced by 5.

    • The 2nd %d is replaced by 2.

    • The 3rd %d is replaced by the result of the 3rd argument expression: 5 + 2.

Since the code explicitly asks the computer to add (5 + 2), the result is 7, even though the text displays a multiplication sign (*).

Beware: Never trust the text inside the quotes to tell you what math is happening! Always look at the arguments after the comma.
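If the printed line should really show a product, the multiplication has to happen in the argument list. A minimal sketch, using snprintf so the result can be checked; the helper name `format_product` is an assumption for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: writes "a * b = product" into buf.
   The third argument is computed with '*', so the number printed
   agrees with the asterisk shown in the format string. */
int format_product(char *buf, size_t size, int a, int b) {
    return snprintf(buf, size, "%d * %d = %d", a, b, a * b);
}
```

With `char buf[32];`, calling `format_product(buf, sizeof buf, 5, 2)` stores `5 * 2 = 10` in `buf`.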

4. Type Mismatches and Compilation Errors

C requires you to match the type of data with the correct specifier.

Float vs Int Mismatch

printf("%d", 5.5);
  • You are passing a floating-point value (the literal 5.5 has type double) but asking printf to treat it as an integer (%d).

  • Result: Garbage Value. This is undefined behavior: printf looks for an int that was never passed, so the number printed is unpredictable. It is not a rounded or truncated 5.5.
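There are two well-defined ways out: keep the value as a double and use %f, or convert it to int yourself and use %d. The sketch below shows both; the helper names are hypothetical, and snprintf is used so the text can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: keeps the value as a double and uses the
   matching %f specifier (six digits after the decimal point by default). */
int format_as_double(char *buf, size_t size, double value) {
    return snprintf(buf, size, "%f", value);
}

/* Hypothetical helper: converts explicitly to int first (the fractional
   part is discarded), so %d now matches the argument type. */
int format_as_int(char *buf, size_t size, double value) {
    return snprintf(buf, size, "%d", (int)value);
}
```

For 5.5, the first helper produces `5.500000` and the second produces `5`.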

Modulo Operator Error

printf("%f", 5.0 % 2);
  • Result: Compilation Error.

  • Reason: The modulus operator (%) only works on integer operands. It cannot be applied to float or double values; for a floating-point remainder, use fmod from <math.h>.


5. Variable Naming and Validations

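When naming variables (Identifiers), C is flexible, but there are rules: a name may contain letters, digits, and underscores, it must not start with a digit, and it must not be a keyword.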

Valid declarations:

float _;         // A lone underscore is a valid identifier
float _1, _2;    // Digits are allowed after the first character
float _val;      // Letters, digits, and underscores may follow

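Length of Variables: The language sets no maximum length for a variable name, but a compiler is only required to treat a limited number of leading characters as significant. C99 guarantees at least 63 significant characters for internal identifiers and 31 for external ones (C89 guaranteed 31 and 6). Keeping names within these limits ensures that two different names are never treated as the same identifier, and it helps readability.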
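The spelling rules for identifiers are simple enough to check in code. This is a rough sketch under stated assumptions: `is_valid_identifier` is a hypothetical helper, it does not reject keywords such as `int`, and it ignores the length limits discussed above.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: checks only the spelling rules for an identifier. */
bool is_valid_identifier(const char *name) {
    unsigned char first = (unsigned char)name[0];
    if (!(isalpha(first) || first == '_')) {
        return false;       /* must start with a letter or underscore (also rejects "") */
    }
    for (const char *p = name + 1; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (!(isalnum(c) || c == '_')) {
            return false;   /* only letters, digits, and underscores may follow */
        }
    }
    return true;
}
```

The names from the declarations above (`_`, `_1`, `_val`) all pass, while `1abc` and `my-var` are rejected.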

A Note on Argument Compilation Checks

Look at this edge case involving strings and variables:

int ts = 7500;
printf(ts);

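Result: Compilation Error. The first argument of printf must be a format string, that is, a pointer to char, and an int does not convert to that pointer type implicitly. Most current compilers reject the call; some older ones only warn, and the resulting program has undefined behavior. You cannot pass an integer variable directly as the format string.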
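The working version passes a format string first and the integer second. A minimal sketch; the wrapper name `print_ts` is made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: prints an integer by giving printf a format
   string first and the value second. Returns what printf returned. */
int print_ts(int ts) {
    return printf("%d\n", ts);
}
```

Calling `print_ts(7500)` prints 7500 followed by a newline.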

6. What is the "Output" of printf? (The Return Value)

We know printf sends text to the screen, but did you know the function itself has a "return value"?

In C, printf returns an int: the number of characters it wrote to standard output, or a negative value if an output error occurred.

The Hidden Count

If you print the word "Hello", printf returns 5. If you print "Hi\n", it returns 3 (because the newline \n counts as 1 character).

Classic Interview Question:

int x = printf("Hello");
printf(" %d", x);

Output: Hello 5

  • First, printf("Hello") executes, printing "Hello" to the screen.

  • It returns 5 (the length of "Hello").

  • The value 5 is stored in x.

  • The second printf prints that value.

Nested printf

You can even put a printf inside another printf!

printf("%d", printf("C"));
  • Step 1: The inner printf("C") runs first. It prints C and returns 1.

  • Step 2: The outer printf receives that 1 and prints it.

  • Final Output: C1
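The return values discussed in this section can be captured and checked directly. The three wrappers below are hypothetical helpers that hand back whatever printf returned.

```c
#include <stdio.h>

/* Hypothetical helper: prints Hello and returns printf's result. */
int print_hello(void) {
    return printf("Hello");             /* 5 characters */
}

/* Hypothetical helper: the newline counts as one character. */
int print_hi_line(void) {
    return printf("Hi\n");              /* 3 characters: 'H', 'i', '\n' */
}

/* Hypothetical helper: nested call. The inner printf must finish first,
   because its return value (1) is the argument for %d. Screen shows: C1 */
int print_nested(void) {
    return printf("%d", printf("C"));   /* the outer call prints "1" and returns 1 */
}
```

A negative return value would signal an output error, so code that must not lose output can compare the result against 0.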

7. The Case of the Ignored Arguments

What happens if you provide arguments (values) but forget to put the format specifiers (like %d) in the string?

printf("hello", "hai", "bye");

Analysis:

  1. 1st Argument: "hello" (This is the format string).

  2. 2nd Argument: "hai"

  3. 3rd Argument: "bye"

Output: hello

Why? printf looks at the first argument to decide what to do. Since "hello" contains no format specifiers (no %), printf simply prints "hello" and stops. It never reads "hai" and "bye" because it wasn't told to print them. The extra arguments are still evaluated before the call; they are just not used.

Key Rule: printf will only look at the 2nd, 3rd, or nth arguments if there is a corresponding % specifier in the 1st argument to "unlock" them.
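"Ignored" does not mean "skipped": C evaluates every argument before the call happens, even one that printf never reads. The sketch below makes that visible with a counter; `next_value` and `print_with_extra_argument` are hypothetical names, and a compiler with warnings enabled will likely point out the unused argument.

```c
#include <stdio.h>

static int calls = 0;   /* counts how often next_value() runs */

/* Hypothetical helper with a visible side effect. */
static int next_value(void) {
    return ++calls;
}

/* Prints only "hello", yet next_value() still runs once, because the
   argument is evaluated before printf is called. Returns the counter. */
int print_with_extra_argument(void) {
    printf("hello", next_value());
    return calls;
}
```

After one call, the screen shows only `hello`, but the counter is 1.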

📚 Navigate the Series

C Programming

Part 1 of 4

"Master C from the ground up! 🚀 This series breaks down syntax, pointers, and memory management into clear, byte-sized guides. Perfect for beginners building a strong coding foundation."
