An ultraproduct is a construction method of model theory to create a model for a theory in some language (see this or this if you're unsure what "model" means in the context of (modern) mathematical logic) out of an infinite collection of models for that theory in that language. Taking a cartesian product of models in "the obvious way" fails to work; an ultraproduct divides by some equivalence relation to get a new model. Unfortunately, it turns out that the equivalence relation requires the axiom of choice, so it's NOT a "construction".
Suppose we have a collection {Mi}i∈I of models, for some index set I. Let U be an ultrafilter on I. Call a subset A⊆I "most of" I if A∈U. This is not standard terminology! Note that due to the maximality of the ultrafilter U, either A∈U or (I\A)∈U.
We want a new model M which is the ultraproduct of the Mi's over U. Take the cartesian product of the worlds of all the Mi. Define an equivalence relation "~": For elements a=(ai)i∈I and b=(bi)i∈I of the product,
a ~ b ⇔ {i∈I : ai=bi} ∈ U.
That is, a is equivalent to b iff at
most of the indices i∈I they are equal. The world of our model M is the set of
equivalence classes of ~. By
abuse of notation we might say that
M = (∏i∈I Mi) / ~
Once you get a hang of what we're doing, the rest is pretty obvious... For every k-place relation symbol R, we define that
R(a1,...,ak) ⇔ {i∈I : R(a1i, ..., aki) } ∈ U.
("Relation R holds between elements of the ultraproduct if it holds on almost all their coordinates"). Note that this definition of R is
well defined: If we pick different representatives of the equivalence classes of the a
j's, they are equal at almost all coordinates, so (as U is closed to
finite intersections of elements!) the same truth value is given to R.
For every k-place function symbol "f", we define
f(a1,...,ak) = (f(a1i,...,aki))i∈I.
("the value of f in the ultraproduct at the i'th coordinate is its value in the i'th model"). Again note that this is well defined, again due to finite intersections of elements of U being elements of U. And constants are really just 0-place functions, so we've covered those, too.
Unfortunately, yes. Beyond the minor problems of ensuring that the product is non-empty and determining a set of equivalence classes, you really need U to be an ultrafilter:
- The "upwards closure" property (any subset of I which includes a member of U is a member of U) is needed for ~ to be an equivalence relation.
- The "finite intersection" property (any intersection of finitely many members of U is a member of U) is needed for relations and functions to be well-defined.
- The maximality property (U is an ultrafilter, i.e. U is maximal, i.e. for any A⊆I, either A∈U or (I\A)∈U) is needed to give a truth value to every relation.
The simplest ultrafilters, which also don't require the axiom of choice, are principal ultrafilters: U=Ux={A⊆I : x∈A}, for some x∈I. But if we take Ux in the ultraproduct, the properties of the elements are determined solely by their properties at coordinate x. So the model we get is (isomorphic to) Mx. We don't need an ultraproduct to do that, we've already got Mx! We might therefore do well to call an ultraproduct resulting from a principal ultrafilter a trivial ultraproduct.
If I is finite, we only have principal ultrafilters. So the ultraproduct construction is uninteresting for multiplying together finitely many models: it gives us one of the existing models.
When U is a nonprincipal ultrafilter (and therefore I must be infinite), the ultraproduct is interesting. Unfortunately we've no idea what a general element of U looks like, so the ultraproduct will be very mysterious. But we can take some filter F and extend it to an ultrafilter F⊆U. For instance, we can take F={A∈I : I\A is finite}. This ensures that a property of any finite set of models cannot determine the property on the ultraproduct. To get such a nontrivial ultrafilter, we need to use some Choice.
What is it good for
The use of ultraproducts comes from Los' theorem: A proposition P in the first-order language of the models is true in the ultraproduct model iff it is true in most of the models. Proving Los' theorem is actually an easy induction on the structure of P. It's the definition of the ultraproduct that's hard!
It is essentially the compactness theorem of first order logic, but applied to a pre-specified collection of models. And indeed, many uses of the one can be performed with the other as well.
There's also a parallel to non-standard analysis. If the models are all sets (and we can usually take them in some internal model of set theory, so they will be sets), then an infinite collection of models will have non-standard pseudo-elements. An ultraproduct has the same properties as a pseudo-element of the collection; in fact, if a suitable expansion of the standard world into the nonstandard world is performed, then it will be a pseudo-element of the collection. (However, different ultraproducts may require different expansions, while non-standard analysis gives a huge amount of pseudo-elements within the same expansion. There is still some difference between the concepts.)
See an example of an ultraproduct for how to actually make ultraproducts work for you, as well as a proof of a theorem or two...