mixup: Beyond Empirical Risk Minimization

AI-Generated Summary

Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of advanced neural network architectures.

Key Findings

1 In this work, we propose mixup, a simple learning principle to alleviate memorization and sensitivity to adversarial examples.
2 In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels.
3 By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples.
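
The convex-combination step in findings 2 and 3 can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `mixup_pair` and `mixup_batch`, the default `alpha=0.2`, and the choice to draw the mixing weight `lam` from a Beta(alpha, alpha) distribution are assumptions that this summary does not state. Labels are assumed to be one-hot vectors so that they can be blended the same way as the inputs.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Blend two examples and their one-hot labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


def mixup_batch(xs, ys, alpha=0.2, rng=random):
    """Mix every example in a batch with a randomly chosen partner.

    Assumption: one weight per batch, drawn from Beta(alpha, alpha).
    """
    lam = rng.betavariate(alpha, alpha)
    # Shuffled indices decide which partner each example is mixed with.
    partners = list(range(len(xs)))
    rng.shuffle(partners)
    mixed = [
        mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
        for i, j in enumerate(partners)
    ]
    mixed_xs = [x for x, _ in mixed]
    mixed_ys = [y for _, y in mixed]
    return mixed_xs, mixed_ys, lam
```

Training on the mixed inputs against the mixed labels is what would push a model toward linear behavior between examples: a point that is 25% of one example and 75% of another is asked to carry a label blended in the same proportions.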

Why It Matters

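Mixup is a small change to how training examples are constructed, yet it targets two known weaknesses of large networks: memorization and sensitivity to adversarial examples. The reported experiments cover image, speech, and tabular datasets, and show improved generalization across them.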

This summary is based on publicly available metadata and the abstract. The full research paper is available from the original source, OpenAlex.

Article Details

Source	OpenAlex
Category	🤖 Artificial Intelligence
Published	Jan 1, 2024
Journal	arXiv (Cornell University)
DOI	10.57702/dcy1c3gw
Citations	961
Authors	Hongyi Zhang