Feature: How is WASM Compiled? Are you using wasm_simd128 and msimd128 ? #189

Open
3 tasks done
vtempest opened this issue Sep 19, 2024 · 9 comments
Labels
enhancement New feature or request

Comments

@vtempest

vtempest commented Sep 19, 2024

Describe what you are looking for

WASM support for SIMD discussion: emscripten-core/emscripten#12714
emscripten SIMD Docs https://emscripten.org/docs/porting/simd.html
wasm vs wasm with msimd benchmark https://jeromewu.github.io/improving-performance-using-webassembly-simd-intrinsics/
demo code with wasm_simd https://github.com/jeromewu/wasm-perf/blob/main/mul_mats_intrin.c

Here's where the NGT algorithm, which is faster than HNSW, uses SIMD to optimize internally: https://github.com/yahoojapan/NGT/blob/1e44fffc2b95b211ff29ee693abb4a25057042d4/lib/NGT/Clustering.h#L224
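For readers unfamiliar with the `wasm_simd128.h` intrinsics and the `-msimd128` flag named in the title, here is a minimal sketch of a float dot product. It is not taken from USearch, SimSIMD, or NGT; the function name `dot_f32` is hypothetical. The SIMD path is only compiled when the compiler defines `__wasm_simd128__` (for example with `emcc -O3 -msimd128`), and a scalar loop handles the tail and any non-WASM build.

```c
#include <stddef.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

/* Dot product of two f32 vectors of length n (hypothetical helper). */
float dot_f32(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    size_t i = 0;
#if defined(__wasm_simd128__)
    /* Process four floats per iteration in a 128-bit accumulator. */
    v128_t acc = wasm_f32x4_splat(0.0f);
    for (; i + 4 <= n; i += 4) {
        v128_t va = wasm_v128_load(a + i);
        v128_t vb = wasm_v128_load(b + i);
        acc = wasm_f32x4_add(acc, wasm_f32x4_mul(va, vb));
    }
    /* Horizontal sum of the four lanes. */
    sum = wasm_f32x4_extract_lane(acc, 0) + wasm_f32x4_extract_lane(acc, 1) +
          wasm_f32x4_extract_lane(acc, 2) + wasm_f32x4_extract_lane(acc, 3);
#endif
    /* Scalar tail, and the whole loop when SIMD is unavailable. */
    for (; i < n; ++i) sum += a[i] * b[i];
    return sum;
}
```

Because of the scalar fallback, the same file builds with a native compiler, which makes it easy to compare results between the native and WASM builds.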

Can you contribute to the implementation?

  • I can contribute

Is your feature request specific to a certain interface?

It applies to everything

Contact Details

No response

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct
@vtempest vtempest added the enhancement New feature or request label Sep 19, 2024
@ashvardanian
Owner

Hi @vtempest! We don't currently compile to WASM, but it should be compatible with SimSIMD NEON kernels, I believe. What exactly are you looking for?

@vtempest
Author

vtempest commented Sep 19, 2024

I would like to help build the fastest SIMD-accelerated vector search. If USearch uses SimSIMD and compiles to WASM, I'd love to improve upon it and build my vsearch fork with RAM-limited clusters.

@ashvardanian
Owner

Yes, USearch compiles to WASM, but the whole ecosystem is currently fragmented, and it's not clear how to ship library dependencies for WASM. Still, it shouldn't be hard to integrate USearch directly into an arbitrary project that uses WASM, and then compile together as a monorepo. Have you tried that?

@vtempest
Author

vtempest commented Sep 19, 2024

Yes, I was working on top of the original hnswlib ported to WASM here: https://github.com/kaiobarb/hnswlib-wasm?tab=readme-ov-file
I am wondering how to integrate USearch instead as the base lib and then add cluster splitting for RAM limits to it.

yahoojapan/NGT#168 (comment)
The NGT inventor says my RAM-limited clusters approach looks very promising. I'd like to integrate it into HNSW and USearch v3.

@ashvardanian
Owner

@vtempest, have you tried compiling SimSIMD into WASM already? I think the NEON backend should be compatible and can provide a huge boost for i8/u8/f16/bf16 vectors in any search engine, be it USearch or HNSW lib 🤗
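To illustrate why small integer types such as i8 benefit from SIMD, here is a minimal sketch of an i8 dot product written directly with `wasm_simd128.h` intrinsics. It is an assumption-laden illustration, not SimSIMD's NEON kernel and not its API; the name `dot_i8` is hypothetical. Each 128-bit load carries 16 values, which are widened to i16 and reduced with `wasm_i32x4_dot_i16x8`. The SIMD path is only compiled when `__wasm_simd128__` is defined (for example with `emcc -O3 -msimd128`).

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

/* Dot product of two i8 vectors of length n, accumulated in i32
   (hypothetical helper; the accumulator can overflow for very large n). */
int32_t dot_i8(const int8_t *a, const int8_t *b, size_t n) {
    int32_t sum = 0;
    size_t i = 0;
#if defined(__wasm_simd128__)
    v128_t acc = wasm_i32x4_splat(0);
    for (; i + 16 <= n; i += 16) {
        v128_t va = wasm_v128_load(a + i);
        v128_t vb = wasm_v128_load(b + i);
        /* Widen each half to i16, then multiply and add adjacent pairs into i32. */
        v128_t lo = wasm_i32x4_dot_i16x8(wasm_i16x8_extend_low_i8x16(va),
                                         wasm_i16x8_extend_low_i8x16(vb));
        v128_t hi = wasm_i32x4_dot_i16x8(wasm_i16x8_extend_high_i8x16(va),
                                         wasm_i16x8_extend_high_i8x16(vb));
        acc = wasm_i32x4_add(acc, wasm_i32x4_add(lo, hi));
    }
    /* Horizontal sum of the four i32 lanes. */
    sum = wasm_i32x4_extract_lane(acc, 0) + wasm_i32x4_extract_lane(acc, 1) +
          wasm_i32x4_extract_lane(acc, 2) + wasm_i32x4_extract_lane(acc, 3);
#endif
    /* Scalar tail, and the whole loop when SIMD is unavailable. */
    for (; i < n; ++i) sum += (int32_t)a[i] * (int32_t)b[i];
    return sum;
}
```

Whether an existing NEON backend compiles unchanged under Emscripten is a separate question from this sketch and would need to be checked against the actual kernels.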

@Sero1000

Sero1000 commented Nov 27, 2024

Is a WASM build still needed? I can take a look if so.

@ashvardanian
Owner

@Sero1000, wouldn’t hurt, I assume. But it can be implemented in multiple ways - compiling the Rust SDK to WASM, JS, or Python through cibuildwheel + Pyodide. But before all that, we need the core C library to pass compilation 🤗

@vtempest
Author

Would that approach (pyodide, etc) work in cloudflare workers?

Check out my usearch / vsearch demo. It has getEmbeddingModel, convertTextToEmbedding, and addEmbeddingVectorsToIndex, and it exports/imports the vector bin for a specific file as a base64 string saved with that file. This avoids the large RAM requirement when scaling. There is a demo of common quotes and inference on a query.
https://github.com/vtempest/ai-research-agent/blob/53c952c885d5e34f0cab0baa638c78d9ad2e6f14/src/similarity/usearch.js#L9

I would like to make it work without needing the fs for the load/save functions. There should be an export/importBase64String function.

@ashvardanian
Owner

> Would that approach (pyodide, etc) work in cloudflare workers?

@vtempest, I am not familiar and wouldn't have the time to investigate.


3 participants