170 points by liadyo 1 day ago | 51 comments
kiitos 1 day ago
What does this mean? How does it work? How can I understand how it works? The requirements, limitations, constraints? The landing page tells me nothing! Worse, it doesn't have any links or suggestions as to how I could possibly learn how it works.
> Congratulations! The chosen GitHub project is now fully accessible to your AI.
What does this mean??
> GitMCP serves as a bridge between your GitHub repository's documentation and AI assistants by implementing the Model Context Protocol (MCP). When an AI assistant requires information from your repository, it sends a request to GitMCP. GitMCP retrieves the relevant content and provides semantic search capabilities, ensuring efficient and accurate information delivery.
MCP is a protocol that defines a number of concrete resource types (tools, prompts, etc.) -- each of which have very specific behaviors, semantics, etc. -- and none of which are identified by this project's documentation as what it actually implements!
Specifically what aspects of the MCP are you proxying here? Specifically how do you parse a repo's data and transform it into whatever MCP resources you're supporting? I looked for this information and found it nowhere?
broodbucket 23 hours ago
Or maybe I'm so out of the loop it's as obvious as "git" is, I dunno.
fragmede 11 hours ago
Threads like this work better when they can go deeper without rehashing the basics every time.
matthewdgreen 37 minutes ago
johannes1234321 8 hours ago
T3RMINATED 10 hours ago
sdesol 15 hours ago
kiitos 10 hours ago
john2x 13 hours ago
sdesol 6 hours ago
- Identify the files that should be put into context since tokens cost money and I wanted to use a model that was capable like Sonnet, which is expensive.
- There were 35 messages (minus 2 based on how my system works) so I wrote and read quite a bit. I was actually curious to know how it worked since I have domain knowledge in this area.
- Once I knew I had enough context in the messages, I switched to Gemini since it was MUCH cheaper and it could use the output from Sonnet to guide it. I was also confident the output was accurate since I know what would be required to put a Git repo into context and it isn't easy if cost, time and accuracy is important.
Once I went through all of that I figured posting the parent questions would be a good way to summarize the tool, since it was very specific.
So I guess if that is the next LMGTFY, then what I did was surely more expensive and timeconsuming.
sivaragavan 6 hours ago
FWIW, this project creates two tools for a GitHub repo on demand
fetch_cosmos_sdk_documentation
search_cosmos_sdk_documentation
These tools would be available for the MCP client to call when it needs information. The search tool didn't quite work for me, but the fetch did. It pulled the readme and made it available to the MCP client. Like I said before, it's not so helpful at the moment. But I am interested in the possibilities.sdesol 6 hours ago
I think using MCP is an interesting idea, but the heavy lifting that can provide insights, is not with MCP. For fetch and search to work effectively, the MCP will need quality context to know what to consider. I'm biased, but I really looked into chunking documents, but given how the LLM landscape is evolving, I don't think chunking makes a lot sense any more (for code at least).
I've committed to generating short and long overviews for directories and files. Short overviews are two to three sentences. And long overviews are two to three paragraphs. Given how effectively newer LLMs can process 100,000 tokens or less, you can feed it a short overview for all files/directories to determine what files to sub query with. That is, what long overviews to load into context for the sub query.
I also believe most projects in the future will start to produce READMEs for LLMs that are verbose and not easy to grok for humans, but is rich in detail for LLMs. You may not want the LLM to generate the code for you, but the LLM can certainly help us navigate complex/unfamiliar code in a semantic manner, which can be game changer for onboarding.
ianpurton 16 hours ago
1. Some LLMs support function calling. That means they are given a list of tools with descriptions of those tools.
2. Rather than answering your question in one go, the LLM can say it wants to call a function.
3. Your client (developer tool etc) will call that function and pass the results to the LLM.
4. The LLM will continue and either complete the conversation or call more tools (functions)
5. MCP is gaining traction as a standard way of adding tools/functions to LLMs.
GitMCP
I haven't looked too deeply but I can guess.
1. Will have a bunch of API endpoints that the LLM can call to look at your code. probably stuff like, get_file, get_folder etc.
2. When you ask the LLM for example "Tell me how to add observability to the code", the LLM can make calls to get the code and start to look at it.
3. The LLM can keep on making calls to GitMCP until it has enough context to answer the question.
Hope this helps.
sandbags 5 hours ago
Is it just me or is MCP a really bad idea?
We seem to have spent the last 10 years trying to make computing more secure and now people are using node & npx - tools with a less than flawless safety story - to install tools and make them available to a black box LLM that they trust to be non-harmful. On what basis, even about accidental harm I am not sure.
I am not sure if horrified is the right word.
liadyo 1 day ago
nlawalker 24 hours ago
What does etc include? Does this operate on a single content file from the specified GitHub repo?