Synerise Introduces Cleora.ai 2.0: Redefining Entity Representation Learning
Synerise is thrilled to announce the release of Cleora.ai 2.0, the latest version of our cutting-edge, open-source solution for scalable and efficient entity representation learning.
Cleora 2.0 builds on its predecessor's success, offering new features and optimizations that enable even broader use across diverse, relational data sets.
What is Cleora?
Cleora is a high-performance framework designed to generate stable, inductive representations of entities within heterogeneous data structures. By leveraging graph-based principles and optimized algorithms, Cleora allows data professionals to uncover hidden patterns and relationships, making it an invaluable tool for machine learning and data science applications.
The tool has earned international recognition in prestigious data science competitions, including:
- 1st place at SIGIR eCom Challenge 2020
- 2nd place and Best Paper Award at WSDM Booking.com Challenge 2021
- 2nd place at Twitter RecSys Challenge 2021
- 3rd place at KDD Cup 2021
These accomplishments highlight Cleora's versatility and its ability to deliver exceptional results in recommendation systems, graph analytics, and natural language processing.
Cleora is now available as a Python package, pycleora. Key improvements over the previous version:
- Performance optimizations: roughly 10x faster embedding
- Performance optimizations: significantly reduced memory usage
- Latest research: improved embedding quality
- New feature: graphs can be created from Python iterators as well as TSV files
- New feature: seamless integration with NumPy
- New feature: item attributes support via custom embedding initialization
- New feature: adjustable vector projection/normalization after each propagation step
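The iterator input, custom initialization, and per-step normalization listed above can be pictured with a small NumPy sketch. It does not use the pycleora API. The function names (`build_transition_matrix`, `init_from_ids`, `embed`), the space-separated hyperedge format, and the dense co-occurrence matrix are all assumptions made for illustration, and a real implementation would rely on sparse storage.

```python
import hashlib

import numpy as np


def build_transition_matrix(hyperedges):
    """Build a row-normalized co-occurrence matrix from an iterator.

    Each item is a space-separated string of entity ids that occur
    together (a hypothetical input format chosen for this sketch).
    """
    index = {}
    rows = []
    for line in hyperedges:
        rows.append([index.setdefault(tok, len(index)) for tok in line.split()])
    counts = np.zeros((len(index), len(index)))
    for ids in rows:
        for a in ids:
            for b in ids:
                if a != b:
                    counts[a, b] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0  # isolated entities keep an all-zero row
    entity_ids = sorted(index, key=index.get)  # ids in first-seen order
    return entity_ids, counts / row_sums


def init_from_ids(entity_ids, dim, seed=0):
    """Custom initialization: each start vector depends only on id and seed."""
    vectors = np.empty((len(entity_ids), dim))
    for i, entity in enumerate(entity_ids):
        digest = hashlib.sha256(f"{seed}:{entity}".encode()).digest()
        rng = np.random.default_rng(int.from_bytes(digest[:8], "little"))
        vectors[i] = rng.uniform(-1.0, 1.0, size=dim)
    return vectors


def l2_normalize(x):
    """Project each row onto the unit sphere, leaving zero rows untouched."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.where(norms == 0.0, 1.0, norms)


def embed(transition, start, num_steps=3, normalize=l2_normalize):
    """Alternate one propagation step with a caller-chosen normalization."""
    vectors = start
    for _ in range(num_steps):
        vectors = normalize(transition @ vectors)
    return vectors


baskets = iter(["milk bread eggs", "bread butter", "milk butter jam"])
entity_ids, transition = build_transition_matrix(baskets)
embeddings = embed(transition, init_from_ids(entity_ids, dim=8))
```

Because `normalize` is an ordinary function argument, swapping L2 normalization for another projection (for example, per-dimension standardization) only requires passing a different callable. Item attributes could enter the same way, by replacing `init_from_ids` with vectors derived from text or images.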
Why Choose Cleora?
Unlike traditional graph embedding techniques, Cleora operates directly on relational data without requiring explicit graph construction. This not only reduces computational overhead but also eliminates the need for external dependencies. Key advantages include:
- Speed: Cleora can generate embeddings for millions of nodes in a matter of minutes, thanks to its optimized algorithms.
- Simplicity: The tool is straightforward to implement, with minimal setup and configuration.
- Inductive Capabilities: Cleora supports the induction of embeddings for new, unseen entities, making it ideal for dynamic, real-world applications.
- Versatility: Its use cases range from recommender systems to fraud detection, NLP, and customer behavior analysis.
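To make the inductive capability above more concrete, here is a minimal sketch under one assumption: that an unseen entity can be embedded as the normalized average of the embeddings of the entities it interacts with. The helper `induce_embedding` and the dictionary layout are hypothetical and are not part of the pycleora package.

```python
import numpy as np


def induce_embedding(neighbor_ids, known):
    """Embed an unseen entity as the L2-normalized mean of its known neighbors.

    `known` maps entity id -> 1-D NumPy vector (hypothetical structure).
    Neighbors that have no embedding yet are skipped.
    """
    vectors = [known[n] for n in neighbor_ids if n in known]
    if not vectors:
        raise ValueError("no known neighbors to induce from")
    mean = np.mean(vectors, axis=0)
    norm = np.linalg.norm(mean)
    return mean / norm if norm > 0 else mean


# Toy embeddings for two already-embedded items.
known = {
    "milk": np.array([1.0, 0.0]),
    "bread": np.array([0.0, 1.0]),
}

# A new user who interacted with both items plus one unknown item.
new_user = induce_embedding(["milk", "bread", "unknown_item"], known)
```

No retraining is involved in this sketch: the new vector is computed from vectors that already exist, which is what makes the approach attractive for entities that appear after the embeddings were built.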
Key usability features of Cleora embeddings
The technical properties described above make Cleora well suited to production use. From the end user's perspective, they can be summarized as follows:
- heterogeneous relational tables can be embedded without any artificial data pre-processing
- mixed interaction + text datasets can be embedded with ease
- new entities pose no cold-start problem, since embeddings can be induced for them
- real-time updates of the embeddings do not require any separate solutions
- multi-view embeddings work out of the box
- temporal, incremental embeddings are stable out of the box, with no need for re-alignment, rotations, or other methods
- extremely large datasets are supported and can be embedded within seconds to minutes
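One ingredient that plausibly supports stable incremental embeddings is an initialization that depends only on the entity id, never on which other entities are present. The sketch below illustrates that idea with the standard library only. The function `start_vector`, the use of SHA-256, and the snapshot names are assumptions for illustration, not a description of Cleora's internals, and identical start vectors alone do not guarantee identical final embeddings.

```python
import hashlib


def start_vector(entity_id, dim, seed=0):
    """Derive a start vector in [-1, 1) from the entity id and seed alone."""
    values = []
    for k in range(dim):
        digest = hashlib.sha256(f"{seed}:{k}:{entity_id}".encode()).digest()
        unit = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
        values.append(2.0 * unit - 1.0)
    return values


# Two snapshots of the data; the later one contains an extra entity.
january = {e: start_vector(e, 4) for e in ["user_1", "item_a"]}
february = {e: start_vector(e, 4) for e in ["user_1", "item_a", "item_b"]}

# Entities present in both snapshots start from exactly the same point.
same_start = january["user_1"] == february["user_1"]
```

With a randomly seeded initialization, by contrast, each run would begin from a different coordinate system, which is why many embedding methods need an alignment or rotation step before vectors from different runs can be compared.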
Open-Source Commitment
Synerise remains committed to fostering innovation through open-source contributions. We value the feedback and support of the global data science community, which has played a crucial role in the evolution of Cleora.
To learn more, access Cleora.ai 2.0, or contribute to its development, visit our official GitHub repository.
Join us in redefining what's possible in entity representation learning with Cleora.ai 2.0.