topic_modelling / requirements.txt

Commit History

Fixes to representation model, visualisations, and embeddings in CPU mode. Package updates and optimisation for compatibility
db3eaec

seanpedrickcase committed on

Corrected reference to sentence transformers dependency. Updated Dockerfile packages
611584f

seanpedrickcase committed on

Updated package versions in requirements files
5814ab0

seanpedrickcase committed on

Adjusted requirements to the maximum versions available on the Hugging Face python==3.10 platform
6bf616b

seanpedrickcase committed on

Test update to main requirements file for Hugging Face compatibility
9a4b420

seanpedrickcase committed on

llama-cpp-python in GPU mode doesn't seem to work well with BERTopic on Hugging Face, so downgrading it to the CPU version
88d81fa

seanpedrickcase committed on

Rearranged functions for embedding creation to be compatible with a ZeroGPU Space. Updated packages.
cc495e1

seanpedrickcase committed on

Added example of how to run function from command line. Updated packages. Embedding model default now smaller and at fp16.
34f1e83

seanpedrickcase committed on

Improved initial clean options. Now has option to return embeddings only.
89c4d20

seanpedrickcase committed on

App now retains original index following cleaning to allow for referring back to original data
90553eb

seanpedrickcase committed on

Allowed for app running on AWS to use smaller embedding model and not to load representation LLM (due to size restrictions).
22ca76e

seanpedrickcase committed on

Only aggregates topics that are not 'other'. Allowed for a minimum sentence length. Default max_topics will now auto-aggregate topics. Added Cognito Auth functionality (boto3 with AWS).
1e2bb3e

seanpedrickcase committed on

Can split passages into sentences. Improved embedding and LLM representation models. Improved zero-shot capabilities.
55f0ce3

seanpedrickcase committed on

Updated packages. Improved hierarchy vis. Better models - mixedbread and phi3. Now has option to split texts into sentences before modelling.
04a15c5

seanpedrickcase committed on

Upgraded to Gradio 4.16.0. Guide for converting to exe added.
0a177ca

Sonnyjim committed on

Added clean data options, improved re-representation options and visualisation. General format changes
4effac0

Sonnyjim committed on

Lots of general fixes. New visualisations, fixed hierarchical vis for zero-shot. Added calc all probabilities.
b4510a6

Sonnyjim committed on

first commit
9dbf344

Sonnyjim committed on