---
license: mit
language:
- en
tags:
- schema
- word-embeddings
- embeddings
- unsupervised-learning
- tables
- web-table
- schema-data
---
# Pre-trained Web Table Embeddings

The models here represent schema terms and instance data terms in a semantic vector space, which makes them especially useful for representing schema and class information as well as for ML tasks on tabular text data.

The code for executing and evaluating the models is located in the [table-embeddings GitHub repository](https://github.com/guenthermi/table-embeddings).

## Quick Start

To encode text from tables, install the `table_embeddings` package by running the following commands:

```bash
pip install cython
pip install git+https://github.com/guenthermi/table-embeddings.git
```

After that, you can encode text with the following Python snippet:

```python
from table_embeddings import TableEmbeddingModel

# Load the 64-dimensional W-combo model by its Hugging Face model ID
model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_combo64')
# Look up the embedding of a term used as a table header
embedding = model.get_header_vector('headline')
```
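
Once header vectors are available, a common next step is to compare them. The following is a minimal sketch of cosine similarity between two header terms, not part of the `table_embeddings` package. The helper names `cosine_similarity` and `header_similarity` are hypothetical, and the sketch assumes that `get_header_vector` returns a one-dimensional numeric vector for both terms; how the model handles terms outside its vocabulary is not covered here.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def header_similarity(model, term_a: str, term_b: str) -> float:
    """Compare two header terms via any object that offers get_header_vector.

    Assumption: get_header_vector returns a 1-D numeric vector for both terms.
    """
    return cosine_similarity(
        model.get_header_vector(term_a),
        model.get_header_vector(term_b),
    )


# Usage (requires the installed package and a model download):
# from table_embeddings import TableEmbeddingModel
# model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_combo64')
# print(header_similarity(model, 'headline', 'title'))
```

Because `header_similarity` only relies on the `get_header_vector` method, it can be exercised with a small stub object when no model is available.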

## Model Types

| Model Type | Description | Download Links |
| ---------- | ----------- | -------------- |
| W-tax | Model of relations between table header and table body | [64dim](https://huggingface.co/ddrg/web_table_embeddings_tax64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_tax150) |
| W-row | Model of row-wise relations in tables | [64dim](https://huggingface.co/ddrg/web_table_embeddings_row64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_row150) |
| W-combo | Model of row-wise relations and relations between table header and table body | [64dim](https://huggingface.co/ddrg/web_table_embeddings_combo64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_combo150) |
| W-plain | Model of row-wise relations in tables without pre-processing | [64dim](https://huggingface.co/ddrg/web_table_embeddings_plain64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_plain150) |
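
The download links in the table follow a regular naming pattern, `ddrg/web_table_embeddings_<type><dim>`. The sketch below builds such an identifier from a model type and a dimension. The helper `model_id` and the two constants are hypothetical names introduced for illustration, and the sketch assumes that every identifier in the table can be passed to `TableEmbeddingModel.load_model` in the same way as the `combo64` example in the Quick Start.

```python
# Model types and dimensions taken from the table of download links.
MODEL_TYPES = ("tax", "row", "combo", "plain")
DIMENSIONS = (64, 150)


def model_id(model_type: str, dim: int) -> str:
    """Build the model identifier for a given type and dimension."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    if dim not in DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dim!r}")
    return f"ddrg/web_table_embeddings_{model_type}{dim}"


# All eight identifiers listed in the table.
ALL_MODEL_IDS = [model_id(t, d) for t in MODEL_TYPES for d in DIMENSIONS]

# Usage (assumption: load_model accepts each of these identifiers):
# from table_embeddings import TableEmbeddingModel
# model = TableEmbeddingModel.load_model(model_id("row", 150))
```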

## More Information

For examples of how to use the models, see the [GitHub repository](https://github.com/guenthermi/table-embeddings).

More information can be found in the paper [Pre-Trained Web Table Embeddings for Table Discovery](https://dl.acm.org/doi/10.1145/3464509.3464892):

```
@inproceedings{gunther2021pre,
  title={Pre-Trained Web Table Embeddings for Table Discovery},
  author={G{\"u}nther, Michael and Thiele, Maik and Gonsior, Julius and Lehner, Wolfgang},
  booktitle={Fourth Workshop in Exploiting AI Techniques for Data Management},
  pages={24--31},
  year={2021}
}
```