Need Help?

We are thrilled that one of our little Tanukis is there to help you out! It's early days for local inference, so we understand that you might have lots of questions.

Over the coming weeks we will link to blog posts that cover working with Tanuki in detail. In the meantime, we will add the most common questions here.

If you have any further questions about your little Tanuki, feel free to ask them here.
This is Miles the Tanuki. He's your little assistant: he won't wow the dinner party guests, but he will get your work done!
Hi! I'm Miles, your little Tanuki friend. Let's get you sorted!

Frequently Asked Questions

Can Tanuki Run on Older Apple Silicon?
Yes! Tanuki can run on every generation of M-series Macs. Inference is a heavy load, and different generations have different capabilities. The Tanuki app will run on any of these machines, but its performance is bound by memory, both the amount and the speed. Tanuki lets you select your own models to suit your machine. Our blog includes articles on right-sizing models for your use case.
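As a rough illustration of why memory matters, here is a back-of-the-envelope sketch. It only estimates the size of a model's weights from its parameter count and quantisation level, which is a general rule of thumb rather than anything Tanuki-specific. The function names and the `usable_fraction` headroom value are our own assumptions for the example, and real usage will be higher once context and runtime overhead are counted.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in gigabytes (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * bytes_per_weight


def fits_in_memory(params_billions: float, bits_per_weight: int,
                   machine_memory_gb: float, usable_fraction: float = 0.6) -> bool:
    """Hypothetical check: do the weights fit in the share of memory we allow?

    usable_fraction is an assumed safety margin, not a Tanuki setting.
    """
    needed = estimate_weight_memory_gb(params_billions, bits_per_weight)
    return needed <= machine_memory_gb * usable_fraction


if __name__ == "__main__":
    # A 7-billion-parameter model at 4 bits per weight on an 8 GB machine
    print(estimate_weight_memory_gb(7, 4))
    print(fits_in_memory(7, 4, machine_memory_gb=8))
```

The same model at 16 bits per weight needs roughly four times the memory, which is why smaller or more heavily quantised models tend to suit older machines.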
Why use Tanuki and not...the others?
Tanuki is unique in many ways. Most importantly, nothing that runs in Tanuki ever leaves your environment. You might set up Output Adapters to send your content to your own storage, but that's it! Once you have completed the first launch wizard and installed your first model (and potentially added more), Tanuki can work without the internet! Your content is yours: we don't see it, we don't train on it, and we can't use it against you.

While there are tasks that Tanuki cannot perform today, it's perfect for things you do often, such as converting voice to notes, summarising documents or drafting emails.

It has a lot of fun capabilities just like the hyper-scale alternatives. You can craft a template to transform your writing to sound like Dracula wrote it, or Homer Simpson.

And the best part is, once you have crafted your own Workflow Templates you can run them over your content. This is a great tool for field operators, doctors, lawyers, and anyone who does things on repeat and doesn't want to describe the task every time.

We are working on some really cool new workflows. Real-time grammar support, for example, is coming this year, and we think it is going to leave you wondering 'what other subscriptions can I cancel now?'. If you do cancel one, we'd love to hear about it!
Are Tokens Expensive?
Tokens cost nothing! This is the main difference between local inference and those giant cloud doohickeys. If you use Tanuki as a desktop tool, the purchase price is all you pay. You can run Tanuki all day and there are no new charges. We have load tested Tanuki and found that using it all day, every day will increase your annual electricity bill by roughly $30-$50.
What about Intel Macs?
Sadly, we cannot support Intel Macs. This isn't our Tanukis being precious: MLX (the framework that runs machine learning models on Apple machines) can only support Apple Silicon. Again, this isn't Apple being precious. The performance required to run these models is only available through the unified memory architecture that makes new Macs so good to use.
Can I use different models?
Yes, you can, and you should! Different models do different things better, and it's not always obvious why until you play. That's why Tanuki lets you download models from Hugging Face. Just drop the model id into the Model Library with the 'add model' button and Tanuki will pull that model. You can then assign it to one of the available slots (fast and deep today, more to come). Some models may not produce output by default; this is where overrides may help.

A good example is log analysis. Logs are difficult to work with because of their token density, but coder models often handle them better.

If you have specific needs, get in touch with us here. We'd love to hear about what you want to do and how our Tanukis can help!
What are Output Adapters?
In the first launch wizard you have the option of selecting a 'local storage adapter'. This allows Tanuki to push the output of a workflow into a location that you have set on your local machine, which is helpful if you want files that you can forward on. You can always interact with the files from inside Tanuki, but adapters serve a different purpose, and 'local storage' is only one variant.

Other variants allow you to push to various types of file storage, either in the cloud or locally on your network: S3/MinIO, iCloud, Dropbox, Google, Microsoft and generic.

In config there is an 'output adapters' pane that allows you to enable more output adapters globally. Once an adapter is enabled and configured, you can turn it on either in workflow templates directly or via the Workspace homepage.
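To make the idea of adapter variants a little more concrete, here is a minimal sketch of how a set of adapters could share one interface. This is purely illustrative and is not Tanuki's actual code or configuration format: the `OutputAdapter`, `LocalStorageAdapter` and `deliver` names are hypothetical, and only the local variant is shown.

```python
from pathlib import Path
from typing import Protocol


class OutputAdapter(Protocol):
    """Hypothetical interface: anything that can store a workflow's output."""

    def push(self, name: str, content: str) -> str: ...


class LocalStorageAdapter:
    """Writes output into a folder the user has chosen on their own machine."""

    def __init__(self, folder: str) -> None:
        self.folder = Path(folder)

    def push(self, name: str, content: str) -> str:
        # Create the target folder if it does not exist yet
        self.folder.mkdir(parents=True, exist_ok=True)
        target = self.folder / name
        target.write_text(content, encoding="utf-8")
        return str(target)


def deliver(content: str, name: str, adapters: list) -> list:
    """Send one workflow output to every enabled adapter; return where it went."""
    return [adapter.push(name, content) for adapter in adapters]
```

A cloud or network variant would implement the same `push` method, so a workflow would not need to know which kind of storage it is writing to.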
What are Context Templates?
In inference, the input and other content passed into the machine learning model is called context. This includes the specific user input, such as a document to summarise, and other instructions. Tanuki assembles this context per workflow step, based on the configuration of each step's context template. The context template has metadata that allows the user to set a series of parameters for that particular step in the workflow.

A single step might only want the input of the step before, it might want the input of every step, or it might be selective. The context template lets the user make these choices at various layers of the application.
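Here is a small sketch of what that selection could look like. It is not Tanuki's real schema: the `ContextTemplate` class, the `include` values and the `assemble_context` function are all hypothetical names, and the sketch assumes that 'the step before' means the text produced by the previous step.

```python
from dataclasses import dataclass, field


@dataclass
class ContextTemplate:
    """Hypothetical stand-in for a context template; not Tanuki's real schema."""
    prompt: str
    include: str = "previous"  # "previous", "all", or "selected"
    selected_steps: list = field(default_factory=list)  # 0-based step indexes


def assemble_context(template: ContextTemplate, user_input: str, step_outputs: list) -> str:
    """Build the text passed to the model for one workflow step."""
    if template.include == "previous":
        # Only the most recent step's output, or the raw input on the first step
        parts = step_outputs[-1:] or [user_input]
    elif template.include == "all":
        # The original input plus everything produced so far
        parts = [user_input, *step_outputs]
    elif template.include == "selected":
        # Only the earlier steps the template names explicitly
        parts = [step_outputs[i] for i in template.selected_steps]
    else:
        raise ValueError(f"unknown include mode: {template.include}")
    return "\n\n".join([template.prompt, *parts])
```

For example, a template with `include="all"` on the third step would receive the original input together with the output of the first two steps.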

The context templates include the prompt to the model that consumes the input and performs a task on it. That task could be cleaning up language, doing a check against organisational policy, or making your email sound like Dracula wrote it.

These templates are very similar to what you type into an online chatbot. The difference is that you reuse the same template rather than writing a new request in a chat window every time.
What are Workflow Templates?
Workflow templates allow a user to define an input (text, voice, documents, later vision) and then perform a series of jobs against that input. A doctor may want to convert a voice summary of a consult into bullet points that they then copy into their case management system.

A workflow with two jobs (each referencing a context template) could perform this task, first transcribing the audio, then converting the transcript into bullet points. That same doctor might add a third step to the workflow that extracts a set of actions or follow-ups.

Finally, a workflow template also includes metadata to change how the model operates. When summarising a document or extracting action items, the user may want to turn the temperature down on the slot, reducing the creativity of the output. These knobs are optional, but available at every step.
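The doctor's workflow above could be sketched roughly like this. Treat it as an illustration under assumptions rather than how Tanuki is built: `Step`, `run_workflow` and `fake_model` are hypothetical names, the default temperature of 0.7 is made up for the example, and `fake_model` simply stands in for local inference so the flow of data is easy to see.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One job in a workflow. All field names are illustrative only."""
    name: str
    prompt: str                          # stands in for the referenced context template
    temperature: Optional[float] = None  # optional per-step knob


def run_workflow(steps: list, workflow_input: str,
                 run_model: Callable[[str, str, float], str],
                 default_temperature: float = 0.7) -> str:
    """Feed the input through each step in order; each step sees the previous output."""
    current = workflow_input
    for step in steps:
        # Use the step's own temperature only if it set one
        temperature = default_temperature if step.temperature is None else step.temperature
        current = run_model(step.prompt, current, temperature)
    return current


def fake_model(prompt: str, text: str, temperature: float) -> str:
    """Placeholder for local inference: just records what it was asked to do."""
    return f"[{prompt} @ T={temperature}] {text}"


consult_notes = [
    Step("transcribe", "Transcribe the audio"),
    Step("bullets", "Convert to bullet points", temperature=0.2),
    Step("actions", "Extract follow-up actions", temperature=0.2),
]

if __name__ == "__main__":
    print(run_workflow(consult_notes, "consult.wav", fake_model))
```

Dropping the third `Step` from the list gives the two-job version of the workflow, and leaving `temperature` unset on a step keeps whatever the default is.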
What are overrides and how are they applied?
Overrides allow you to adjust the parameters of the model that your workflows run through. Overrides are stacked in layers. At the platform level you can set the values for each 'slot' that runs a workflow step's model. You can then override these values in both the Context Template and Workflow Template layers. Finally, each surface you are working from includes an optional override block. Because the layers are stacked, if you change a field, only your change is passed in to the model slot; all other fields remain the same.

With great power comes great responsibility. It is always possible to configure the model slot in such a way that the model crashes or runs to infinity. Be cautious while working on new models, but don't worry, tokens are free!
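The stacking described above behaves a lot like merging dictionaries, where each layer only replaces the fields it actually sets. The sketch below is an assumption-laden illustration, not Tanuki's implementation: `resolve_overrides` is a hypothetical function, `max_tokens` and `top_p` are example field names, the values are made up, and the relative order of the context template and workflow template layers is our guess.

```python
def resolve_overrides(*layers):
    """Merge override layers from lowest to highest priority.

    Only fields that a layer actually sets replace the value beneath it;
    everything else is carried through unchanged.
    """
    resolved = {}
    for layer in layers:
        for key, value in (layer or {}).items():
            if value is not None:
                resolved[key] = value
    return resolved


# Lowest priority first. Field names and values are illustrative only.
slot_defaults = {"temperature": 0.7, "max_tokens": 1024, "top_p": 0.95}
context_template = {"temperature": 0.3}
workflow_template = {"max_tokens": 512}
surface_override = {"temperature": 0.1}

params = resolve_overrides(slot_defaults, context_template,
                           workflow_template, surface_override)

if __name__ == "__main__":
    print(params)
```

Here the surface's temperature wins, the workflow template's `max_tokens` replaces the slot default, and `top_p` is untouched because no layer changed it.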
How do I use the widget?
The purpose of the widget is to take a quick sample of your voice and inject it into whatever app you are working in. On the widget configuration page you can select a workflow and choose a shortcut command. Pressing the shortcut opens the widget and starts a session. When you are done speaking, either click on the widget or press the shortcut again. Your words will be processed by the workflow and injected into the app you are currently working in.
How does compose work?
Compose is a scratch space that you can run a workflow over. This is useful for having Tanuki do things like draft emails. You can use the record button to drop text into the page, or you can paste text in. You can then run the model over the content and see the change in the pane.

Compose has a chat panel that allows you to talk to Miles, your Tanuki. While this is an evolving capability, Miles is a little different to regular chatbots. Your full chat history lives within Tanuki, so you are able to ask Miles questions that cloud-based chatbots couldn't answer, like 'what types of cars did we talk about last January?'.

Right now Miles cannot interact directly with the text you supply in the Compose window. In a coming release, however, he will be able to make changes to your content based on chat.