9.0 Internet Measurement
This section covers the principles and techniques for measuring and understanding the Internet and other computer networks, focusing on core concepts and their implications rather than specific tool implementations or detailed techniques like sampling.
Key Concept Relationships and Comparative Analysis
Measurement Paradigms Comparison Table
Aspect | Passive Measurement | Active Measurement |
---|---|---|
Data Source | Existing network traffic | Artificially generated packets |
Network Impact | Zero additional load | Adds measurement traffic |
Use Case | Production traffic analysis | Specific property testing |
Information Type | Real user behavior | Controlled test conditions |
Examples | SNMP polling, flow monitoring | Ping, traceroute, bandwidth tests |
Accuracy | High for actual usage | High for specific metrics |
Privacy Concerns | Higher (real user data) | Lower (test data) |
Network Abstraction Levels Hierarchy
┌─────────────────────────────────────────┐
│ AS Level (Organizations) │ ← Policy, Economics
├─────────────────────────────────────────┤
│ Logical/Router Level (Routing) │ ← Protocol, Performance
├─────────────────────────────────────────┤
│ Physical Level (Infrastructure) │ ← Hardware, Geography
└─────────────────────────────────────────┘
↑ Increasing Abstraction ↑
Common Characteristics Across Levels: * All levels face measurement challenges (privacy, data scarcity, heterogeneity) * All require graph theory metrics for analysis * All contribute to overall network understanding
Key Differences: * Granularity: Physical (cables) → Router (hops) → AS (organizations) * Analysis Focus: Infrastructure → Performance → Policy * Measurement Complexity: Hardware monitoring → Protocol analysis → Economic modeling
1. Motivation and Importance of Internet Measurement
Core Motivations Framework
Internet Measurement Motivations
├── Technical Improvements
│ ├── Understanding network behavior
│ ├── Identifying and solving problems
│ └── Performance optimization
├── Operational Benefits
│ ├── Better network operations
│ ├── Pricing and billing accuracy
│ └── Security monitoring
└── Research & Innovation
├── Scientific discovery
├── Model creation for experiments
└── Identifying new phenomena
Societal Impact Dimensions: * Technical: Performance, resilience, security analysis * Economic: Pricing models, traffic discrimination * Political: State censorship, regulation compliance * Social: Social media impact, information spread patterns
2. Challenges in Internet Measurement
Challenge Categories and Interconnections
Design Challenges (fundamental limitation)
├── Not designed for measurement
├── Side-effect exploitation
└── "Artful piling of hacks"
Data Challenges (fundamental limitation)
├── Lack of ground truth
├── Data scarcity and sensitivity
└── Privacy and ethical concerns
Network Challenges
├── Heterogeneity (devices, technologies, IoT)
├── Limited result generalizability
└── Temporal variability (failures, disasters)
Challenge Interdependencies: * Design → Data: Poor measurability design leads to data quality issues * Data → Network: Limited data availability compounds heterogeneity problems * Network → Design: Dynamic changes require adaptive measurement approaches
3. Layers of Measurement
Network Stack Measurement Matrix
Layer | Focus Area | Measurement Targets | Performance Metrics |
---|---|---|---|
Higher (Layer 8+) | Social/Context | News propagation, social networks | Spread rate, influence metrics |
Transport (Layer 4) | Protocol Performance | TCP/UDP behavior, congestion | Throughput, reliability |
Network (Layer 3) | Routing & Topology | Path discovery, failure analysis | Latency, packet loss |
Physical (Layer 1-2) | Infrastructure | Cables, device connectivity | Bandwidth, RTT, availability |
Cross-Layer Dependencies: * Application performance requires analysis across all layers to identify bottlenecks * Physical connectivity heavily influences routing, bandwidth, delay, and RTT * Layers 4-5 downwards primarily focus on performance-related metrics
4. Network Structure (Graph Theory Metrics)
Graph Theory Foundations
A network can be viewed as a graph, consisting of nodes (devices) and edges (connections). Graph theory concepts can be applied to understand network structure.
Key Metrics Comparison and Calculation
Metric | Formula | Range | Interpretation | Use Case |
---|---|---|---|---|
Degree Distribution P(k) | P(k) = Nk / N (Nk = number of nodes with degree k, N = total nodes) | [0, 1] | Probability that a node has k connections | Network resilience analysis |
Distance (Shortest Path) | Count of edges in shortest path | [1, ∞] | Communication efficiency | Latency prediction |
Diameter | max(distance(i, j)) for all pairs | [1, ∞] | Worst-case path length | Network span analysis |
Average Path Length | Σdistance(i, j) / total_pairs | [1, diameter] | Typical communication cost | Overall efficiency |
Clustering Coefficient | Ci = 2Ei / (ki(ki-1)) (Ei = edges among node i's neighbors, ki = degree of node i) | [0, 1] | Local interconnectedness | Local efficiency |
Worked Example: Clustering Coefficient Calculation
Given Network:
Node A connects to: B, C, D
Node B connects to: A, C
Node C connects to: A, B, D
Node D connects to: A, C
For Node A: * Degree (kA) = 3 (connects to B, C, D) * Neighbors: B, C, D * Edges between neighbors (EA): B-C, C-D = 2 edges * CA = 2EA / (kA(kA-1)) = 2(2) / (3×2) = 4/6 = 0.67
Interpretation: 2 of the 3 possible links among Node A's neighbors exist (about 67%), indicating good local connectivity.
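The calculation for Node A can be checked with a few lines of Python. This is a minimal sketch, assuming the network is stored as a dictionary of adjacency sets; the function name `clustering_coefficient` and the convention of returning 0 for nodes with fewer than two neighbors are illustrative choices, not part of the original notes.

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """Local clustering coefficient Ci = 2Ei / (ki(ki-1))."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: report undefined cases as 0
    # Count edges that directly link two neighbors of `node`.
    links = sum(1 for u, v in combinations(sorted(neighbors), 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

# Undirected example network from the text, stored as adjacency sets.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C"},
}

c_a = clustering_coefficient(adj, "A")
print(round(c_a, 2))  # 0.67
```

Only the pairs B-C and C-D among A's neighbors are linked, so the function counts 2 of 3 possible edges, matching the hand calculation.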
Network Characteristics Patterns
Typical Large Network Properties: * Degree Distribution: Many nodes have low degree, few have very high degree (scale-free) * Path Properties: Short average path lengths despite large network size (small-world) * Clustering: Higher than random networks due to local community structures
5. Internet Topology Measurement (Levels)
Topology Abstraction Levels
Topology measurement aims to model the Internet accurately and observe trends in interconnectivity across different conceptual levels:
Internet Topology Hierarchy
AS Level (Autonomous Systems)
├── Focus: Inter-organizational connections
├── Granularity: ISPs, corporations, universities
├── Applications: Policy analysis, economic modeling
└── Dependencies: Relies on router-level information
↕ Information Flow ↕
Router/Logical Level
├── Focus: Information flow paths
├── Granularity: Individual routers and links
├── Applications: Performance optimization, routing
└── Dependencies: Relies on physical connectivity
↕ Information Flow ↕
Physical Level
├── Focus: Actual infrastructure
├── Granularity: Cables, devices, geographic locations
├── Applications: Reliability analysis, capacity planning
└── Dependencies: Hardware monitoring and geographic data
Level Comparison Matrix
Aspect | Physical Level | Router Level | AS Level |
---|---|---|---|
Entities | Cables, devices | Routers, logical links | Organizations, ISPs |
Analysis Focus | Infrastructure reliability | Protocol performance | Economic relationships |
Failure Impact | Component outages | Routing disruptions | Policy changes |
Measurement Tools | Hardware monitoring | Protocol analysis | BGP data, agreements |
Timescale | Hardware lifecycles | Protocol updates | Business relationships |
Key Principle: As you move up the levels (Physical → Router → AS), you rely on information from the lower levels and become less concerned with their specific details.
6. Types of Measurement
Measurement Approach Taxonomy
Network Measurement Approaches
Passive Measurement
├── Method: Observe existing traffic
├── Impact: Zero network load
├── Data: Real user behavior
└── Applications:
    ├── Traffic analysis
    ├── Usage patterns
    ├── Billing verification
    └── Security monitoring
Active Measurement
├── Method: Generate test traffic
├── Impact: Adds measurement overhead
├── Data: Controlled test conditions
└── Applications:
    ├── Performance testing
    ├── Topology discovery
    ├── Capacity planning
    └── SLA verification
SNMP (Simple Network Management Protocol) Analysis
SNMP Framework: * Purpose: Network management services for data collection and configuration * Structure: Management Information Base (MIB) defines data organization * Operation: Router polling for counter data (bytes, packets) * Applications: Billing, anomaly detection, trend analysis
SNMP Capabilities vs Limitations:
Capability | Description | Limitation | Impact |
---|---|---|---|
Counter Collection | Byte/packet counts per interface | Aggregate Data Only | Cannot identify traffic types |
Periodic Polling | Regular data collection (e.g., 5 min) | Limited Granularity | May miss short-term events |
Anomaly Detection | Identify unusual traffic patterns | No Traffic Details | Cannot determine attack types |
Trend Analysis | Long-term usage patterns | No Source/Destination | Limited security analysis |
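The Counter Collection and Periodic Polling rows can be illustrated by turning two successive counter readings into an average rate. The sketch below assumes a 32-bit octet counter (SNMP Counter32 values wrap to 0 after 2^32 - 1) and at most one wrap between polls; the function name and the sample readings are hypothetical.

```python
COUNTER32_MODULUS = 2**32  # Counter32 values wrap back to 0 after 2^32 - 1

def average_rate_bps(prev_octets, curr_octets, interval_s, modulus=COUNTER32_MODULUS):
    """Average bit rate between two polls of an octet counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Modular subtraction handles a single counter wrap between the two polls.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s

# Hypothetical readings taken 300 s (5 min) apart.
print(average_rate_bps(1_000_000, 38_500_000, 300))  # 1000000.0
```

The result is an average over the whole polling interval, which is why a short burst inside a 5-minute window is invisible in this kind of data.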
MRTG (Multi Router Traffic Grapher): * Visualizes SNMP data as time series plots * Effective for identifying bandwidth anomalies and trends * Can indicate security issues (e.g., DDoS attacks) through traffic spikes
7. Flow and Traffic Matrix Concepts
Flow Analysis Framework
To understand network behavior comprehensively, we need to analyze traffic flow patterns beyond simple SNMP counters.
Flow Definition and Characteristics
Flow: A unidirectional stream of packets between source and destination with specific identifying characteristics:
Flow Identification Parameters:
├── Network Layer (Layer 3)
│ ├── Source IP Address
│ ├── Destination IP Address
│ └── Protocol Type (TCP/UDP/ICMP)
├── Transport Layer (Layer 4)
│ ├── Source Port Number
│ └── Destination Port Number
└── Interface Information
└── Input Interface ID
Flow vs SNMP Comparison
Aspect | SNMP Counters | Flow Records |
---|---|---|
Granularity | Interface-level aggregates | Per-connection details |
Traffic Visibility | Total bytes/packets | Application-specific flows |
Analysis Capability | Bandwidth trends | Application breakdown |
Protocol Support | Protocol-agnostic (per-interface totals) | Protocol-aware |
Security Analysis | Basic anomaly detection | Detailed attack analysis |
Flow Record Structure and Applications
Flow Record Components: * Identification: Source/destination IPs, ports, protocol * Counters: Byte and packet counts per flow * Protocol Data: TCP flags, ToS bits * Timing: First and last packet timestamps * Interface: Ingress interface information
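Protocol-Specific Considerations: * TCP Flows: Connection-oriented, easier to track (SYN and FIN/RST flags mark a clear start and end) * UDP Flows: Connectionless, with no handshake or teardown to mark start and end, so harder to track (flow boundaries are usually inferred from inactivity timeouts)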
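The flow definition and record components above can be sketched as a small flow table. This is an illustrative sketch only: the `Packet` tuple and its field names are hypothetical, and real exporters also expire flows on timeouts and record TCP flags and ToS bits, which are omitted here.

```python
from collections import namedtuple

# Hypothetical packet summary; field names are illustrative only.
Packet = namedtuple("Packet", "ts src_ip dst_ip proto src_port dst_port in_if size")

def build_flow_table(packets):
    """Group packets into unidirectional flow records keyed by the flow parameters."""
    flows = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.proto, p.src_port, p.dst_port, p.in_if)
        rec = flows.get(key)
        if rec is None:
            flows[key] = {"bytes": p.size, "packets": 1, "first": p.ts, "last": p.ts}
        else:
            rec["bytes"] += p.size
            rec["packets"] += 1
            rec["first"] = min(rec["first"], p.ts)
            rec["last"] = max(rec["last"], p.ts)
    return flows

packets = [
    Packet(0.00, "10.1.1.5", "192.168.1.10", "TCP", 12345, 80, 1, 600),
    Packet(0.05, "192.168.1.10", "10.1.1.5", "TCP", 80, 12345, 2, 1500),
    Packet(0.09, "10.1.1.5", "192.168.1.10", "TCP", 12345, 80, 1, 400),
]
flows = build_flow_table(packets)
print(len(flows))  # 2: each direction of the exchange is its own flow
```

Because the key includes source and destination in order, the request and the reply of one TCP connection end up as two separate records, which is what "unidirectional" means in practice.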
Traffic Matrix Construction
Traffic Matrix: Represents data transmission volumes between every pair of network nodes (subnets)
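Traffic Matrix Example (simplified; rows are source subnets, columns are destination subnets, values in Gbps):
Source \ Destination | A | B | C | D |
---|---|---|---|---|
A | 0 | 15 | 8 | 3 |
B | 12 | 0 | 22 | 7 |
C | 5 | 18 | 0 | 11 |
D | 9 | 4 | 6 | 0 |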
Applications: * Application Breakdown: Identify traffic by service (web traffic via TCP port 80) * Flow Counting: Monitor active connections per application * Capacity Planning: Understand inter-subnet communication patterns * Security Analysis: Detect unusual communication patterns
Flow Aggregation Strategy
Purpose: Reduce data volume while preserving analytical value
Benefits: * Data Reduction: Minimize export data volume * Memory Efficiency: Lower router memory requirements * Reliability: Avoid packet loss during traffic spikes * Processing: Router-side aggregation before analysis
Challenge: Building accurate traffic matrices remains a non-trivial problem requiring sophisticated aggregation and estimation techniques.
Exam Preparation: Key Calculation Examples
Example 1: Graph Theory Metrics Calculation
Scenario: Small network with 5 nodes (A, B, C, D, E)
Connections: * A connects to: B, C * B connects to: A, C, D * C connects to: A, B, D, E * D connects to: B, C, E * E connects to: C, D
Calculate:
Degree Distribution P(k): - Degree 2: A, E (2 nodes) → P(2) = 2/5 = 0.4 - Degree 3: B, D (2 nodes) → P(3) = 2/5 = 0.4 - Degree 4: C (1 node) → P(4) = 1/5 = 0.2
Clustering Coefficient for Node C: - Neighbors: A, B, D, E (degree = 4) - Edges between neighbors: A-B, B-D, D-E = 3 - CC = 2(3) / (4×3) = 6/12 = 0.5
Network Diameter: - Find all shortest paths, identify maximum - A to E: A→C→E (distance = 2) - All pairs have distance ≤ 2 - Diameter = 2
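For checking answers of this kind, the degree distribution, diameter, and average path length can be computed directly from the connection list. The sketch below assumes a connected, undirected graph stored as adjacency sets and uses breadth-first search for hop counts; the function names are illustrative. In the connection list, A and E each have two neighbors, B and D have three, and C has four.

```python
from collections import Counter, deque

def degree_distribution(adj):
    """P(k) = Nk / N, where Nk is the number of nodes with degree k."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: n_k / len(adj) for k, n_k in sorted(counts.items())}

def bfs_distances(adj, source):
    """Hop counts from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average_path_length(adj):
    """Assumes a connected, undirected graph."""
    lengths = []
    for s in adj:
        dist = bfs_distances(adj, s)
        lengths.extend(d for t, d in dist.items() if t != s)
    return max(lengths), sum(lengths) / len(lengths)

# Five-node network from Example 1.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D", "E"},
    "D": {"B", "C", "E"},
    "E": {"C", "D"},
}
print(degree_distribution(adj))  # {2: 0.4, 3: 0.4, 4: 0.2}
diameter, avg_len = diameter_and_average_path_length(adj)
print(diameter, avg_len)  # 2 1.3
```

Seven of the ten node pairs are directly connected and the remaining three (A-D, A-E, B-E) are two hops apart, which gives the average path length of 13/10 = 1.3.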
Example 2: Flow Analysis Scenario
Given: Router observes traffic for 1 hour with following flow data:
Flow | Src IP | Dst IP | Protocol | Src Port | Dst Port | Bytes | Packets |
---|---|---|---|---|---|---|---|
1 | 10.1.1.5 | 192.168.1.10 | TCP | 12345 | 80 | 150MB | 120,000 |
2 | 10.1.1.8 | 192.168.1.10 | TCP | 23456 | 80 | 75MB | 60,000 |
3 | 10.1.1.5 | 192.168.1.20 | UDP | 34567 | 53 | 2MB | 1,500 |
Analysis Questions:
1. Application Breakdown: How much web traffic (port 80)? - Web traffic = Flow 1 + Flow 2 = 150MB + 75MB = 225MB
2. Flow Counting: How many active web connections? - TCP port 80 connections = 2 flows
3. Traffic Matrix (subnet level): - 10.1.1.0/24 → 192.168.1.0/24: 227MB total (225MB web + 2MB DNS) - DNS queries: 2MB (UDP port 53)
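The three answers can be reproduced from the flow table with a short script. This is a sketch under stated assumptions: the tuple layout, the helper names, and the fixed /24 prefix length are illustrative choices, and byte counts are kept in whole megabytes as given in the table.

```python
import ipaddress
from collections import defaultdict

# Flow records from the table: (src_ip, dst_ip, proto, src_port, dst_port, megabytes)
flows = [
    ("10.1.1.5", "192.168.1.10", "TCP", 12345, 80, 150),
    ("10.1.1.8", "192.168.1.10", "TCP", 23456, 80, 75),
    ("10.1.1.5", "192.168.1.20", "UDP", 34567, 53, 2),
]

def subnet_of(ip, prefix_len=24):
    """Map an address to its enclosing subnet (a /24 is assumed here)."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))

def volume_by_service(flows):
    """Total megabytes per (protocol, destination port)."""
    totals = defaultdict(int)
    for _src, _dst, proto, _sport, dport, mb in flows:
        totals[(proto, dport)] += mb
    return dict(totals)

def traffic_matrix(flows, prefix_len=24):
    """Total megabytes per (source subnet, destination subnet) pair."""
    matrix = defaultdict(int)
    for src, dst, _proto, _sport, _dport, mb in flows:
        matrix[(subnet_of(src, prefix_len), subnet_of(dst, prefix_len))] += mb
    return dict(matrix)

by_service = volume_by_service(flows)
web_flows = sum(1 for f in flows if f[2] == "TCP" and f[4] == 80)
matrix = traffic_matrix(flows)

print(by_service[("TCP", 80)])  # 225
print(web_flows)  # 2
print(matrix[("10.1.1.0/24", "192.168.1.0/24")])  # 227
```

Grouping by subnet pair is itself a form of flow aggregation: three flow records collapse into a single matrix cell, at the cost of losing the per-application detail.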
Key Insight: Flow analysis provides detailed application visibility that SNMP counters cannot deliver.