Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: arXiv 1908.10084
This is a sentence-transformers model finetuned from distilbert/distilbert-base-uncased on the askubuntu-questions dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Full model architecture:
SentenceTransformer(
(0): Transformer({'max_seq_length': 75, 'do_lower_case': False, 'architecture': 'DistilBertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
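The Pooling module above uses mean pooling (`pooling_mode_mean_tokens: True`): the token embeddings produced by the transformer are averaged, counting only non-padding tokens. A minimal numpy sketch of that operation on dummy tensors (not the real model output):

```python
import numpy as np

# Mean pooling as configured above: average token embeddings over the
# sequence, masking out padding positions (attention_mask == 0).
def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

# Toy example: batch of 1, seq_len 3 (last token is padding), dim 2
emb = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(emb, mask))  # [[2. 3.]] -- padding token is ignored
```

In the real model the input is DistilBERT's last hidden state of shape (batch, seq_len, 768), so the pooled output is one 768-dimensional vector per sentence.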
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("tomaarsen/distilbert-base-uncased-askubuntu-ct")
# Run inference
sentences = [
'installing by using wubi on windows vista 64',
'some problems with the keyboard layout ?',
'how do i make nautilus windows stick for drag & drop ?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.1882, 0.2154],
# [0.1882, 1.0000, 0.1747],
# [0.2154, 0.1747, 1.0000]])
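`model.similarity` defaults to cosine similarity between the embeddings. A minimal numpy equivalent, shown on toy 2-dimensional vectors rather than real 768-dimensional embeddings:

```python
import numpy as np

# Cosine similarity matrix: normalize each row to unit length, then take
# dot products. This is what the default similarity function computes.
def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy stand-ins for model.encode output
emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sim = cosine_similarity_matrix(emb, emb)
print(np.round(sim, 4))  # diagonal is 1.0: each vector matches itself
```

As in the tensor printed above, the matrix is symmetric with ones on the diagonal; off-diagonal entries rank how semantically close each pair of sentences is.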
Evaluated with RerankingEvaluator on the askubuntu-dev and askubuntu-test datasets with these parameters:

{
  "at_k": 10
}
| Metric | askubuntu-dev | askubuntu-test |
|---|---|---|
| map | 0.5172 | 0.5511 |
| mrr@10 | 0.6605 | 0.6833 |
| ndcg@10 | 0.5574 | 0.5978 |
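For reference, mrr@10 and ndcg@10 can be sketched for a single query as follows (toy ranking with binary relevance; the evaluator averages these scores over all queries):

```python
import math

# MRR@k: reciprocal rank of the first relevant candidate within the top k.
def mrr_at_k(ranking, relevant, k=10):
    for i, doc in enumerate(ranking[:k], start=1):
        if doc in relevant:
            return 1.0 / i
    return 0.0

# nDCG@k with binary relevance: discounted gain of the actual ranking,
# normalized by the best possible (ideal) ranking.
def ndcg_at_k(ranking, relevant, k=10):
    dcg = sum(1.0 / math.log2(i + 1)
              for i, doc in enumerate(ranking[:k], start=1) if doc in relevant)
    ideal = sum(1.0 / math.log2(i + 1)
                for i in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0

ranking = ["d3", "d1", "d7", "d2"]   # model's ordering of candidates
relevant = {"d1", "d2"}              # ground-truth duplicates
print(mrr_at_k(ranking, relevant))   # 0.5 (first relevant doc at rank 2)
print(round(ndcg_at_k(ranking, relevant), 4))
```

Here the relevant candidates are AskUbuntu questions marked as duplicates of the query question; `at_k: 10` means only the top 10 reranked candidates count.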
Training dataset columns text1 and text2:

| | text1 | text2 |
|---|---|---|
| type | string | string |

Samples:

| text1 | text2 |
|---|---|
| how to get the `` your battery is broken '' message to go away ? | how to get the `` your battery is broken '' message to go away ? |
| how can i set the software center to install software for non-root users ? | limiting file access for a huge number of users |
| what are some alternatives to upgrading without using the standard upgrade system ? | how can change background of nautilus |
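The sample pairs above include a sentence paired with itself; that is characteristic of Contrastive Tension training data, where each sentence forms a positive pair with itself and negative pairs with randomly drawn other sentences. A hypothetical helper sketching that construction (`ct_pairs` is illustrative, not the library's API):

```python
import random

# Contrastive Tension (CT) batch construction, sketched: for every sentence,
# emit one positive pair (the sentence with itself, label 1) and k-1 negative
# pairs (the sentence with random other sentences, label 0).
def ct_pairs(sentences, k=4, seed=42):
    rng = random.Random(seed)
    pairs = []
    for s in sentences:
        pairs.append((s, s, 1))  # positive: identical sentence pair
        for _ in range(k - 1):
            neg = rng.choice([t for t in sentences if t != s])
            pairs.append((s, neg, 0))
    return pairs

pairs = ct_pairs(["a", "b", "c"], k=3)
print(len(pairs))  # 9: one positive + two negatives per sentence
```

Two independently initialized encoders score each pair, and the loss pushes the dot product of the positive (identical) pairs up and of the negative pairs down; no labeled data is needed.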
Loss: ContrastiveTensionLoss

Non-default hyperparameters:
- eval_strategy: steps
- per_device_train_batch_size: 16
- learning_rate: 2e-06
- num_train_epochs: 1
- optim: rmsprop

All hyperparameters:
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 8
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-06
- weight_decay: 0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: rmsprop
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss | askubuntu-dev_ndcg@10 | askubuntu-test_ndcg@10 |
|---|---|---|---|---|
| -1 | -1 | - | 0.5147 | 0.5170 |
| 0.0100 | 100 | 88.9905 | - | - |
| 0.0199 | 200 | 5.4677 | - | - |
| 0.0299 | 300 | 2.6312 | - | - |
| 0.0399 | 400 | 1.3858 | - | - |
| 0.0499 | 500 | 0.9374 | - | - |
| 0.0598 | 600 | 0.5592 | - | - |
| 0.0698 | 700 | 0.7174 | - | - |
| 0.0798 | 800 | 0.5705 | - | - |
| 0.0898 | 900 | 0.4788 | - | - |
| 0.0997 | 1000 | 0.3194 | 0.5407 | - |
| 0.1097 | 1100 | 0.2518 | - | - |
| 0.1197 | 1200 | 0.2657 | - | - |
| 0.1296 | 1300 | 0.2614 | - | - |
| 0.1396 | 1400 | 0.2060 | - | - |
| 0.1496 | 1500 | 0.1802 | - | - |
| 0.1596 | 1600 | 0.2680 | - | - |
| 0.1695 | 1700 | 0.2539 | - | - |
| 0.1795 | 1800 | 0.2850 | - | - |
| 0.1895 | 1900 | 0.2270 | - | - |
| 0.1995 | 2000 | 0.2129 | 0.5506 | - |
| 0.2094 | 2100 | 0.1698 | - | - |
| 0.2194 | 2200 | 0.2380 | - | - |
| 0.2294 | 2300 | 0.1907 | - | - |
| 0.2394 | 2400 | 0.3914 | - | - |
| 0.2493 | 2500 | 0.1575 | - | - |
| 0.2593 | 2600 | 0.1907 | - | - |
| 0.2693 | 2700 | 0.1080 | - | - |
| 0.2792 | 2800 | 0.1505 | - | - |
| 0.2892 | 2900 | 0.1195 | - | - |
| 0.2992 | 3000 | 0.0943 | 0.5573 | - |
| 0.3092 | 3100 | 0.1538 | - | - |
| 0.3191 | 3200 | 0.1044 | - | - |
| 0.3291 | 3300 | 0.2145 | - | - |
| 0.3391 | 3400 | 0.2781 | - | - |
| 0.3491 | 3500 | 0.1988 | - | - |
| 0.3590 | 3600 | 0.2708 | - | - |
| 0.3690 | 3700 | 0.1731 | - | - |
| 0.3790 | 3800 | 0.2764 | - | - |
| 0.3889 | 3900 | 0.1160 | - | - |
| 0.3989 | 4000 | 0.2061 | 0.5542 | - |
| 0.4089 | 4100 | 0.1619 | - | - |
| 0.4189 | 4200 | 0.1711 | - | - |
| 0.4288 | 4300 | 0.1330 | - | - |
| 0.4388 | 4400 | 0.1505 | - | - |
| 0.4488 | 4500 | 0.1210 | - | - |
| 0.4588 | 4600 | 0.1164 | - | - |
| 0.4687 | 4700 | 0.1653 | - | - |
| 0.4787 | 4800 | 0.1489 | - | - |
| 0.4887 | 4900 | 0.0486 | - | - |
| 0.4987 | 5000 | 0.1202 | 0.5589 | - |
| 0.5086 | 5100 | 0.1503 | - | - |
| 0.5186 | 5200 | 0.0976 | - | - |
| 0.5286 | 5300 | 0.0675 | - | - |
| 0.5385 | 5400 | 0.0918 | - | - |
| 0.5485 | 5500 | 0.2239 | - | - |
| 0.5585 | 5600 | 0.1034 | - | - |
| 0.5685 | 5700 | 0.1660 | - | - |
| 0.5784 | 5800 | 0.1669 | - | - |
| 0.5884 | 5900 | 0.0716 | - | - |
| 0.5984 | 6000 | 0.3106 | 0.5616 | - |
| 0.6084 | 6100 | 0.1240 | - | - |
| 0.6183 | 6200 | 0.1670 | - | - |
| 0.6283 | 6300 | 0.2198 | - | - |
| 0.6383 | 6400 | 0.1169 | - | - |
| 0.6482 | 6500 | 0.1376 | - | - |
| 0.6582 | 6600 | 0.2339 | - | - |
| 0.6682 | 6700 | 0.1729 | - | - |
| 0.6782 | 6800 | 0.0491 | - | - |
| 0.6881 | 6900 | 0.1400 | - | - |
| 0.6981 | 7000 | 0.0688 | 0.5660 | - |
| 0.7081 | 7100 | 0.2194 | - | - |
| 0.7181 | 7200 | 0.1351 | - | - |
| 0.7280 | 7300 | 0.0832 | - | - |
| 0.7380 | 7400 | 0.1015 | - | - |
| 0.7480 | 7500 | 0.0390 | - | - |
| 0.7580 | 7600 | 0.2088 | - | - |
| 0.7679 | 7700 | 0.0888 | - | - |
| 0.7779 | 7800 | 0.2217 | - | - |
| 0.7879 | 7900 | 0.1913 | - | - |
| 0.7978 | 8000 | 0.0557 | 0.5582 | - |
| 0.8078 | 8100 | 0.0986 | - | - |
| 0.8178 | 8200 | 0.1408 | - | - |
| 0.8278 | 8300 | 0.0744 | - | - |
| 0.8377 | 8400 | 0.1375 | - | - |
| 0.8477 | 8500 | 0.0746 | - | - |
| 0.8577 | 8600 | 0.0734 | - | - |
| 0.8677 | 8700 | 0.0827 | - | - |
| 0.8776 | 8800 | 0.1275 | - | - |
| 0.8876 | 8900 | 0.1072 | - | - |
| 0.8976 | 9000 | 0.1975 | 0.5577 | - |
| 0.9075 | 9100 | 0.0408 | - | - |
| 0.9175 | 9200 | 0.0584 | - | - |
| 0.9275 | 9300 | 0.2589 | - | - |
| 0.9375 | 9400 | 0.0503 | - | - |
| 0.9474 | 9500 | 0.1529 | - | - |
| 0.9574 | 9600 | 0.0840 | - | - |
| 0.9674 | 9700 | 0.2059 | - | - |
| 0.9774 | 9800 | 0.0634 | - | - |
| 0.9873 | 9900 | 0.0837 | - | - |
| 0.9973 | 10000 | 0.1010 | 0.5574 | - |
| -1 | -1 | - | 0.5574 | 0.5978 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@inproceedings{carlsson2021semantic,
title={Semantic Re-tuning with Contrastive Tension},
author={Fredrik Carlsson and Amaru Cuba Gyllensten and Evangelia Gogoulou and Erik Ylip{\"a}{\"a} Hellqvist and Magnus Sahlgren},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=Ov_sMNau-PF}
}
Base model: distilbert/distilbert-base-uncased