Graph Neural Networks: Theory, Architectures, and Applications

GNNs extend deep learning to non-Euclidean, graph-structured data, enabling representation learning on nodes, edges, and whole graphs. They are rooted in spectral graph theory and the spatial message-passing paradigm. Fundamental assumption: a node's representation depends on the features of itself and its neighbors, computed via iterative aggregation. Key components: feature matrix X ∈ ℝ^(N×D), adjacency matrix A ∈ {0,1}^(N×N), learnable parameters Θ.

Spectral GNNs use the graph Fourier basis: L = D − A → eigendecomposition UΛUᵀ → spectral filters g_θ(Λ) applied in the Fourier domain → x_out = U g_θ(Λ) Uᵀ x. Limitations: non-local, graph-specific filters and high compute. ChebNet approximates g_θ with K-th order Chebyshev polynomials: g_θ ≈ Σ_{k=0}^{K} θ_k T_k(L̃), L̃ = 2L/λ_max − I, giving K-hop receptive fields and a spatial interpretation (see the Chebyshev sketch below). GCN (Graph Convolutional Network) simplifies this to a first-order linear approximation: X' = σ(Ã X W), with Ã = D̂⁻½ Â D̂⁻½, Â = A + I, D̂_ii = Σ_j Â_ij (GCN sketch below). Fast and effective for transductive learning, but limited in expressivity (oversmoothing, fixed filter weights).

Spatial GNNs aggregate neighbor information directly: h_v^(k) = UPDATE(h_v^(k−1), AGG({h_u^(k−1) | u ∈ N(v)})). MPNN unifies these as MESSAGE, AGGREGATE, and UPDATE functions instantiated with GRUs/LSTMs/MLPs. GAT introduces attention: α_uv = softmax_{u∈N(v)}(LeakyReLU(aᵀ[W h_u ‖ W h_v])), enabling adaptive neighbor weighting; edge features e_uv can enter the attention or message computation (GAT sketch below). GraphSAGE samples fixed-size neighborhoods, enabling inductive learning on large, evolving graphs, with mean/pooling/LSTM aggregators (mean-aggregator sketch below). GIN maximizes discriminative power via an injective multiset function: h_v^(k) = MLP((1+ε) h_v^(k−1) + Σ_{u∈N(v)} h_u^(k−1)), proven as powerful as the Weisfeiler-Lehman (WL) graph isomorphism test (GIN sketch below).

Hierarchical GNNs coarsen graphs via pooling. DiffPool learns soft cluster assignments S ∈ ℝ^(N×C): Z = SᵀX, A' = SᵀAS (sketch below). MinCutPool uses spectral clustering priors; TopKPool and SAGPool apply node scoring and selection. Readout functions: sum/max/mean pooling, Set2Set (LSTM-with-attention set encoder), or hierarchical attention (e.g., PoolingGAT).

Training is end-to-end via backpropagation. The loss varies by task: cross-entropy (node/graph classification), MSE (regression), contrastive (graph similarity), triplet (ranking). Negative sampling is critical for link prediction.

Advanced architectures: GCNII enables deep GCNs via an initial residual and identity mapping: H^(k+1) = σ(((1−α_k) P̃ H^(k) + α_k H^(0)) ((1−β_k) I + β_k W^(k))), P̃ = D̂⁻½ Â D̂⁻½. GPR-GNN learns a full spectral filter via Generalized PageRank coefficients. SIGN precomputes k-step diffusions Z_k = ReLU(Ã^k X W_k) and concatenates them for an MLP, avoiding deep stacking (sketch below).

Heterogeneous GNNs handle multiple node/edge types. R-GCN uses one weight matrix per relation: h_v' = W_0 h_v + Σ_{r∈R} Σ_{u∈N_r(v)} (1/|N_r(v)|) W_r h_u (sketch below). CompGCN jointly embeds nodes and relations in knowledge graphs. Temporal GNNs: DySAT (temporal attention over snapshots), TGN (temporal message passing with a memory module), EvolveGCN (GCN weights evolved by an RNN).

Applications: node classification (citation networks: Cora, Citeseer), link prediction (social networks, knowledge-graph completion), graph classification (molecular property prediction: Tox21, QM9), recommendation (PinSage: item-to-item graphs), traffic forecasting (sensor graphs), NLP (AMR parsing, syntax-aware MT). Recent state of the art: Graphormer (shortest-path distance as attention bias, structural encodings), GPS (GraphGPS: hybrid local message passing + global attention), GraphMAE (masked graph autoencoder for self-supervised learning).

Limitations: over-squashing (information from long paths compressed into fixed-size vectors), over-smoothing (repeated aggregation → indistinguishable representations), scalability (dense A → O(N²)), weak dynamic-graph support. Mitigations: jumping-knowledge connections, virtual nodes, positional encodings, subgraph sampling (e.g., Cluster-GCN, GraphSAINT). Evaluation: standard fixed train/val/test splits; metrics: accuracy/F1/ROC-AUC for classification, MAE/RMSE for regression.
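A minimal sketch of the Chebyshev filter from the spectral paragraph above, in plain PyTorch on a dense rescaled Laplacian. The function name and dense-tensor setup are illustrative, not a library API; the point is that the recurrence avoids any eigendecomposition.

```python
import torch

def cheb_filter(x: torch.Tensor, l_tilde: torch.Tensor, thetas) -> torch.Tensor:
    """Apply g_theta(L~) x = sum_k theta_k T_k(L~) x via the Chebyshev
    recurrence T_0 = I, T_1 = L~, T_k = 2 L~ T_{k-1} - T_{k-2}."""
    t_prev, t_cur = x, l_tilde @ x          # T_0 x and T_1 x
    out = thetas[0] * t_prev
    if len(thetas) > 1:
        out = out + thetas[1] * t_cur
    for theta in thetas[2:]:
        # Each recurrence step extends the receptive field by one hop
        t_prev, t_cur = t_cur, 2 * (l_tilde @ t_cur) - t_prev
        out = out + theta * t_cur
    return out
```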
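A single GCN layer written out in plain PyTorch with a dense adjacency matrix, directly mirroring X' = σ(Ã X W); class and variable names are illustrative. Production code would use sparse ops (e.g., PyG's GCNConv), since a dense Ã costs O(N²) memory.

```python
import torch
import torch.nn as nn

class DenseGCNLayer(nn.Module):
    """One GCN layer: X' = ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Self-loops: A_hat = A + I
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)
        # Symmetric normalization: D_hat^-1/2 A_hat D_hat^-1/2
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
        return torch.relu(a_norm @ self.weight(x))
```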
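A single-head GAT layer sketch, again dense and illustrative. It uses the standard decomposition aᵀ[Wh_u ‖ Wh_v] = a_leftᵀWh_u + a_rightᵀWh_v to score all pairs at once, and adds self-loops so every node has at least one neighbor to attend to; real GAT stacks several such heads.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATLayer(nn.Module):
    """Single-head GAT attention on a dense adjacency (sketch)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a_left = nn.Parameter(torch.randn(out_dim) * 0.1)
        self.a_right = nn.Parameter(torch.randn(out_dim) * 0.1)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        h = self.W(x)                                        # (N, F')
        # e[v, u] = LeakyReLU(a^T [h_v || h_u]) for all pairs at once
        e = F.leaky_relu(
            (h @ self.a_left).unsqueeze(1) + (h @ self.a_right).unsqueeze(0),
            negative_slope=0.2,
        )
        # Self-loops, then mask non-edges and softmax over each neighborhood
        adj = adj + torch.eye(adj.size(0), device=adj.device)
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=1)                      # attention weights
        return alpha @ h
```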
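The GraphSAGE mean aggregator reduced to its core update h_v' = ReLU(W·[h_v ‖ mean of neighbor features]). The fixed-size neighborhood sampling that makes GraphSAGE scale is omitted here, and names are illustrative.

```python
import torch
import torch.nn as nn

class DenseSAGELayer(nn.Module):
    """GraphSAGE-style mean aggregation: concat self with neighbor mean."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(2 * in_dim, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)  # avoid divide-by-zero
        neigh_mean = (adj @ x) / deg                     # mean over N(v)
        return torch.relu(self.lin(torch.cat([x, neigh_mean], dim=1)))
```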
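The GIN update is short enough to show whole; ε is learnable here (the GIN-ε variant), and the sum aggregation is what makes the multiset function injective. Dense and illustrative as before.

```python
import torch
import torch.nn as nn

class DenseGINLayer(nn.Module):
    """GIN update: h_v' = MLP((1 + eps) * h_v + sum_{u in N(v)} h_u)."""
    def __init__(self, dim: int):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))  # learnable epsilon
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # adj @ x is the neighbor sum (injective multiset aggregation)
        return self.mlp((1 + self.eps) * x + adj @ x)
```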
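One DiffPool coarsening step, following Z = SᵀX and A' = SᵀAS. In the paper the assignment matrix S comes from a dedicated GNN; a plain linear scorer stands in for it here to keep the sketch self-contained, so treat the assignment module as an assumption.

```python
import torch
import torch.nn as nn

class DiffPoolLayer(nn.Module):
    """Soft-cluster pooling: N nodes -> C coarsened cluster nodes."""
    def __init__(self, in_dim: int, n_clusters: int):
        super().__init__()
        self.assign = nn.Linear(in_dim, n_clusters)  # stand-in for assignment GNN

    def forward(self, x: torch.Tensor, adj: torch.Tensor):
        s = torch.softmax(self.assign(x), dim=1)  # S in R^{N x C}, rows sum to 1
        z = s.T @ x                               # pooled features  Z  = S^T X
        adj_pooled = s.T @ adj @ s                # coarsened graph  A' = S^T A S
        return z, adj_pooled
```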
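A sketch of SIGN's precomputation: the k-step diffusions Ã^k X are computed once up front, so training reduces to an MLP over their concatenation (the per-hop weights W_k live inside that downstream MLP). The function name is illustrative.

```python
import torch

def sign_features(x: torch.Tensor, a_norm: torch.Tensor, k: int) -> torch.Tensor:
    """Precompute [X, A~X, A~^2 X, ..., A~^k X]; no message passing is
    needed at training time, only an MLP on the concatenation."""
    feats, cur = [x], x
    for _ in range(k):
        cur = a_norm @ cur            # one more diffusion hop
        feats.append(cur)
    return torch.cat(feats, dim=1)    # shape (N, (k + 1) * D)
```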
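A dense R-GCN layer mirroring the per-relation update above. Encoding the heterogeneous graph as one adjacency matrix per relation type is an illustrative choice for the sketch, not a library convention.

```python
import torch
import torch.nn as nn

class DenseRGCNLayer(nn.Module):
    """R-GCN: a self-loop weight W_0 plus one weight matrix W_r per relation."""
    def __init__(self, in_dim: int, out_dim: int, num_relations: int):
        super().__init__()
        self.self_loop = nn.Linear(in_dim, out_dim, bias=False)       # W_0
        self.rel = nn.ModuleList(
            nn.Linear(in_dim, out_dim, bias=False) for _ in range(num_relations)
        )

    def forward(self, x: torch.Tensor, adjs: list) -> torch.Tensor:
        out = self.self_loop(x)
        for adj, lin in zip(adjs, self.rel):
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1)  # 1 / |N_r(v)|
            out = out + (adj @ lin(x)) / deg
        return torch.relu(out)
```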
Pitfalls: data leakage through the adjacency matrix (e.g., test edges visible during message passing), improper val/test masking, ignoring graph sparsity, ignoring edge directionality, inadequate hyperparameter tuning (layers, dropout, learning rate). Best practices: 2–4 layers (deeper is not better without residual tricks), dropout on features and edges, L2 regularization, batch norm, the Adam optimizer, early stopping on validation performance (see the training-loop sketch below). Benchmark datasets: Planetoid (Cora), OGB (ogbn-arxiv), TUDataset (PROTEINS). Frameworks: PyG (PyTorch Geometric), DGL. Future directions: integration with LLMs, causal GNNs, graph foundation models, neuro-symbolic reasoning.
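A minimal transductive training loop illustrating two of the practices above: the loss and metrics touch only the masked nodes (avoiding val/test leakage at the loss level), and early stopping monitors validation accuracy. Function name and hyperparameter defaults are illustrative.

```python
import torch
import torch.nn.functional as F

def train_node_classifier(model, x, adj, y, train_mask, val_mask,
                          epochs=200, patience=20, lr=0.01, weight_decay=5e-4):
    """Full-graph forward pass, but loss/metrics restricted to masked nodes."""
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    best_val, best_state, wait = 0.0, None, 0
    for _ in range(epochs):
        model.train()
        opt.zero_grad()
        out = model(x, adj)
        # Loss on training nodes only: labels of val/test nodes stay unused
        loss = F.cross_entropy(out[train_mask], y[train_mask])
        loss.backward()
        opt.step()

        model.eval()
        with torch.no_grad():
            pred = model(x, adj).argmax(dim=1)
            val_acc = (pred[val_mask] == y[val_mask]).float().mean().item()
        if val_acc > best_val:
            # Snapshot the best weights (cloned, so later updates don't alias)
            best_val = val_acc
            best_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
            wait = 0
        else:
            wait += 1
            if wait >= patience:  # early stopping on stalled validation accuracy
                break
    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```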
