A corpus is a searchable database of language samples for linguistic research. A corpus may be based on written or spoken language. Some corpora are tagged or annotated by part of speech; other corpora are plain text.
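To illustrate the difference between a tagged corpus and a plain-text one, here is a minimal Python sketch. It assumes a simple word/TAG layout, which is one common convention among several; the sample sentence, the tag names, and the helper functions are hypothetical and do not come from any particular corpus.

```python
# Hypothetical sample: each token is written as word/TAG.
# The sentence and the tag labels are illustrative assumptions.
tagged_text = "The/DET corpus/NOUN contains/VERB many/ADJ samples/NOUN"


def parse_tagged(text):
    """Split whitespace-separated 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # rpartition splits on the last slash, so a word containing '/' stays intact.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


def words_with_tag(pairs, tag):
    """Return every word annotated with the given part-of-speech tag."""
    return [word for word, t in pairs if t == tag]


pairs = parse_tagged(tagged_text)

# Dropping the tags recovers the plain-text version of the same sample.
plain_text = " ".join(word for word, _ in pairs)

# The annotation allows searches by category that plain text cannot support directly.
nouns = words_with_tag(pairs, "NOUN")
```

A plain-text corpus can only be searched by word form, while the tagged version also supports queries such as "all nouns", which is why annotation matters for many research questions.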
PHOIBLE: Repository of cross-linguistic phonological inventory data, compiled into a single searchable convenience sample. Includes phoneme inventories and distinctive feature data for every phoneme in every language in the sample.