Announcing LaVague

Introducing LaVague, an open-source Large Action Model framework to automate automation.

Daniel Huynh

TL;DR

Mithril Security's resources will be poured into developing LaVague, our new automation framework leveraging LLMs.

In parallel, we will continue our work on open-source enclaves for AI safety with the Future of Life Institute, a major think tank in the field. We will soon open-source AICert, a framework that provides cryptographic proof of AI model provenance, i.e. what data and code were used to train a model. Other major projects in that area will be announced in the future.

While we continue our open-source work on enclaves, a large part of our team will be dedicated to LaVague, which leverages LLMs to automate the writing of automation pipelines, beginning with Selenium.

You can find more on LaVague with:

- Our GitHub

- Our Discord 

- A Gradio demo to get started

The journey

Mithril Security was founded in 2021 in Paris and began by open-sourcing BlindAI, an AI deployment framework leveraging Intel SGX secure hardware to deploy models inside secure enclaves.

BlindAI protects both data and models: it guarantees the privacy of data sent to an AI provider and the confidentiality of the weights when deployed on-premise.

BlindAI was audited by Quarkslab in 2023 and was leveraged by the Future of Life Institute.

We have always been passionate about AI and privacy and believe in open-source for security, transparency, and trust. 

I will not bore you with the many frameworks we have developed to make AI more privacy-friendly, but if you care, you can also have a look at:

  • BastionLab: a remote data science framework with built-in access control
  • BlindBox: a framework to easily deploy Docker images inside Trusted Execution Environments
  • BlindLlama: a BlindBox v2, supported by the OpenAI Cybersecurity Grant program, to deploy Kubernetes images on Azure instances with vTPMs
  • BlindChat: a framework to chat with local models running in your browser using transformers.js

We also shared our analysis of the LLM ecosystem, from the Total Cost of Ownership of AI models to hallucination detection and the memorization of private data by LLMs. Our goal with these resources was to educate the market, helping companies get onboarded on AI in general and, for privacy-sensitive customers, leverage our confidential AI stack.

It was quite exciting to work on all this, but as a startup, we needed to find product-market fit.

It all started with a side project...

Before becoming CEO of Mithril Security, a privacy and security startup, I was an AI engineer by training and by passion.

Since the rise of LLMs, I had been looking for opportunities to explore their potential, but due to my duties at Mithril, I had not been able to put in the time I wanted.

However, in early March 2024, I participated in a hackathon that featured LLMs for function calling. I wanted to win the Apple Vision Pro, so I put in some effort to develop a quick and dirty working demo. As I believe LLMs have the potential to automate mechanical tasks, like web browsing, I came up with a framework that automatically generates Selenium code to program a browser from natural language instructions.

I tried it, it worked, and voila! LaVague was born.

Because we at Mithril are firm believers in open-source, I decided to open-source the project after the hackathon. I announced it with an initial tweet, and it took off!

Original tweet

Following that, we managed to make #1 on Hacker News.

Hacker News post

Those events led to the explosive growth of our project:

GitHub star history

After seeing that much enthusiasm for LaVague and talking to early users, we realized that this project has huge potential to help developers in their automation journey.

After a (very) short exchange with my team, we realized that the opportunity to democratize automation with AI was too exciting not to pursue, so we decided to broaden our mission and allocate Mithril's resources to make LaVague the new standard for automating automation!

🌊 LaVague: A new wave is coming

That’s where LaVague comes in! 

LaVague is a Large Action Model framework whose goal is to automate automation. By leveraging LLMs under the hood, we make it easy to generate Selenium code that automates web interactions from simple human instructions.
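To make this concrete, here is a minimal sketch of the core loop. This is illustrative only, not LaVague's actual implementation: the model, prompt, and page are assumptions, and any LLM client could be substituted.

```python
# Minimal sketch of the idea (illustrative, not LaVague's actual code):
# give an LLM the current page's HTML plus a natural-language instruction,
# get back Selenium code, then run it against the live driver.
from openai import OpenAI  # any LLM client would do; OpenAI is used here for brevity
from selenium import webdriver

client = OpenAI()           # assumes OPENAI_API_KEY is set in the environment
driver = webdriver.Chrome()
driver.get("https://huggingface.co")

instruction = "Click on the search bar and type 'LaVague'"
prompt = (
    "You are given the HTML of a web page and an instruction.\n"
    "Write Python Selenium code, using the existing `driver` object, "
    "that performs the instruction. Return only code.\n\n"
    f"HTML:\n{driver.page_source[:8000]}\n\n"
    f"Instruction: {instruction}\nCode:"
)

completion = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative; local models can be plugged in instead
    messages=[{"role": "user", "content": prompt}],
)
generated_code = completion.choices[0].message.content

exec(generated_code)  # execute the generated Selenium actions on the page
```

The real framework wraps much more around this loop (retrieval of the relevant HTML, prompt engineering, local model support), but the core idea is that simple.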

You can see it in action below, where simple instructions are given to post on Hugging Face Social Posts:

You can play with it directly by using this Colab. You can also find our GitHub here.

Fun story: LaVague started as a project to win a Vision Pro at a local SF hackathon. While I unfortunately did not win the hackathon, I won much more than that: a Vision for automation!

We believe LLMs will not displace many people in the near future, as they are not as flexible or intelligent as humans need to be for many jobs. However, with the proper engineering (prompt engineering, Chain of Thought, fine-tuning, etc.), they have great potential to help automate mundane tasks.

That is why our framework, LaVague, has immense potential to empower people in their day-to-day work: it lets an AI take care of menial, mechanical tasks, like browsing a website for information or filling out forms, so humans can focus on reasoning and planning and delegate the execution of mechanical tasks to machines.

Philosophy

Because we believe AI has the potential to profoundly impact our lives, we think such technology should be developed in the open.

That is why LaVague is an open-source framework, leveraging other open-source libraries, such as Hugging Face or LlamaIndex, under the hood. Because we want people to be able to have their own private LLMs to automate their tasks, LaVague natively supports both local and remote LLM calls to provide as much flexibility as possible.
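As a rough illustration of that flexibility (this is not LaVague's actual API, and the model names are only examples), the code-generation step only needs a `generate(prompt) -> str` callable, which can be backed either by a remote API or by a local open-weights model:

```python
# Illustrative sketch of local vs. remote backends (not LaVague's actual API).
def remote_generate(prompt: str) -> str:
    """Remote call to a hosted LLM; requires an API key."""
    from openai import OpenAI
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # example remote model
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def local_generate(prompt: str) -> str:
    """Local generation with a Hugging Face model; no data leaves the machine."""
    from transformers import pipeline
    generator = pipeline("text-generation", model="google/gemma-7b-it")  # example local model
    return generator(prompt, max_new_tokens=512)[0]["generated_text"]

# The rest of the pipeline does not care which backend is used:
generate = local_generate  # or remote_generate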

Our key principle is that hackers hack for free. We want this to be a project by and for the AI community and beyond. All core components are developed openly, and we strive to guide this project to unlock the most value for the largest number.

Obviously, as a startup, we still need a monetization strategy. We have decided to adopt an open-core approach with LaVague: users can use and modify LaVague at will, but some Enterprise features (security, compliance, audit, scalability, etc.) will be packaged and sold to the Enterprise market.

In addition, we will develop a hosted solution to make it easy for developers to get on board with LaVague.

Roadmap

So now, what is coming next to LaVague?

Our end goal is to automate automation and provide the ultimate tooling for developers to easily program pipelines to automate menial tasks.

Our first focus is to solve web automation. As most interactions happen on the internet today, providing an easy solution to interact with web resources could greatly help reduce time spent on menial tasks.

Therefore, the initial efforts will go into developing the best framework to generate web pipelines, focusing first on Selenium workflows. As Selenium is an industry standard, it will be the first solution we support, though others, such as Playwright, will be integrated later.
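For a sense of what this looks like, here is the kind of Selenium snippet such a pipeline might produce for an instruction like "type 'LaVague' in the search bar and press Enter". The site and selector are hypothetical; the actual generated code depends on the page's HTML.

```python
# Hypothetical generated output for: "Type 'LaVague' in the search bar and press Enter".
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("https://huggingface.co")  # example page

search_bar = driver.find_element(By.CSS_SELECTOR, "input[type='text']")  # selector depends on the page
search_bar.click()
search_bar.send_keys("LaVague")
search_bar.send_keys(Keys.ENTER)
```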

We aim within three months to have both:

  • Created a decentralized and open dataset of web interactions to evaluate and train LaVague, ensuring it generates correct Selenium code
  • Trained a model reaching 95% accuracy on a representative dataset of internet interactions

Some non-exhaustive elements of this roadmap:

  • Fine-tuning Gemma 7B for a better local model
  • Improving the retriever so it finds the relevant HTML of the current page with the right precision/accuracy (see the sketch after this list)
  • Building a Hub of functions created by LaVague
  • Integrating other frameworks, such as Playwright or Selenium IDE, with a browser plugin
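On the retriever point, a rough sketch of the idea (assuming an off-the-shelf sentence-transformers embedding model; this is not LaVague's actual retriever) is to chunk the page's HTML, embed the chunks, and keep only those most relevant to the instruction so the LLM sees a small, focused context:

```python
# Sketch of HTML retrieval (illustrative, not LaVague's actual retriever):
# chunk the page source, embed the chunks, and return the most relevant ones.
from sentence_transformers import SentenceTransformer, util

def retrieve_relevant_html(html: str, instruction: str,
                           chunk_size: int = 1000, top_k: int = 3) -> str:
    chunks = [html[i:i + chunk_size] for i in range(0, len(html), chunk_size)]
    model = SentenceTransformer("all-MiniLM-L6-v2")  # small local embedding model
    chunk_embeddings = model.encode(chunks, convert_to_tensor=True)
    query_embedding = model.encode(instruction, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, chunk_embeddings)[0]
    best = scores.topk(min(top_k, len(chunks))).indices.tolist()
    return "\n".join(chunks[i] for i in best)
```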

Conclusion

Mithril Security has seen a lot since its inception in 2021. Even though our initial focus on enclaves for AI has not borne the fruit we hoped for, we are still working steadily with partners like the Future of Life Institute to make AI confidentiality and transparency a reality!

We have started a new journey with LaVague, more focused on unlocking the full potential of AI to automate automation!

If you are interested in contributing, asking questions, or proposing features, please contact us on Discord! If you want professional support in your adoption of LaVague, you can also email us directly.