How to run GraphRAG locally

Set up a Python venv environment

```shell
python -m venv venv
```

Clone microsoft/graphrag

```shell
git clone https://github.com/microsoft/graphrag.git
```

Use version v0.9.0

```shell
cd graphrag
git checkout v0.9.0
```

Add the following dependencies to pyproject.toml (under `[tool.poetry.dependencies]`)

```toml
fastapi = "^0.115.0"
uvicorn = "^0.31.0"
asyncio = "^3.4.3"  # note: asyncio is part of the standard library; this PyPI pin may be unnecessary
utils = "^1.0.2"
```

Install the packages

```shell
poetry install
```

Prepare main.py and the search modules so GraphRAG can be deployed as a FastAPI app

```shell
git clone https://github.com/brightwang/graphrag-dify.git
```

Copy main.py to the repository root

Copy search.py and search_prompt.py to graphrag/graphrag/query/structured_search/local_search

Initialize the knowledge database

```shell
mkdir knowledge
poetry run poe init --root ./knowledge/01_denodo
```

The init step creates 01_denodo directly under knowledge. Create an input folder inside it and place your knowledge files there as .txt.
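The folder layout above can also be prepared from Python; the file name and contents below are illustrative, not part of the tutorial:

```python
from pathlib import Path

# Create the input folder under the initialized knowledge root
input_dir = Path("knowledge/01_denodo/input")
input_dir.mkdir(parents=True, exist_ok=True)

# Drop in a plain-text knowledge file (name and content are illustrative)
sample = input_dir / "denodo_notes.txt"
sample.write_text("Denodo exposes virtual views over source tables.\n", encoding="utf-8")

print(sorted(p.name for p in input_dir.iterdir()))
```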

Edit settings.yaml to switch from OpenAI to Azure OpenAI

```yaml
llm:
  type: azure_openai_chat
  api_base: ${AZURE_OPENAI_API_BASE}
  api_version: ${AZURE_OPENAI_CHAT_API_VERSION}
  auth_type: api_key # or azure_managed_identity
  api_key: ${AZURE_OPENAI_CHAT_API_KEY}
  # audience: "https://cognitiveservices.azure.com/.default"
  # organization: <organization_id>
  model: ${AZURE_OPENAI_CHAT_MODEL}
  deployment_name: ${AZURE_OPENAI_CHAT_DEPLOYMENT_NAME}
  # encoding_model: cl100k_base # automatically set by tiktoken if left undefined
  model_supports_json: true # recommended if this is available for your model.
  concurrent_requests: 25 # max number of simultaneous LLM requests allowed
  async_mode: threaded # or asyncio
  retry_strategy: native
  max_retries: 10

embeddings:
  llm:
    type: azure_openai_embedding
    api_base: ${AZURE_OPENAI_API_BASE}
    api_version: ${AZURE_OPENAI_EMBEDDING_API_VERSION}
    auth_type: api_key # or azure_managed_identity
    api_key: ${AZURE_OPENAI_EMBEDDING_API_KEY}
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    model: ${AZURE_OPENAI_EMBEDDING_MODEL}
    deployment_name: ${AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME}
    # encoding_model: cl100k_base # automatically set by tiktoken if left undefined
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25 # max number of simultaneous LLM requests allowed
    async_mode: threaded # or asyncio
    retry_strategy: native
    max_retries: 10

### Input settings ###
```
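Every credential in the config above is pulled from an environment variable, so a quick preflight check before indexing can catch a missing .env entry early. This helper is a convenience sketch; the variable list simply mirrors the `${...}` references in the YAML above:

```python
import os

# All ${VAR} placeholders referenced in the Azure OpenAI settings above
REQUIRED_VARS = [
    "AZURE_OPENAI_API_BASE",
    "AZURE_OPENAI_CHAT_API_VERSION",
    "AZURE_OPENAI_CHAT_API_KEY",
    "AZURE_OPENAI_CHAT_MODEL",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_EMBEDDING_API_VERSION",
    "AZURE_OPENAI_EMBEDDING_API_KEY",
    "AZURE_OPENAI_EMBEDDING_MODEL",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
]

def missing_vars(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [v for v in REQUIRED_VARS if not env.get(v)]

print(missing_vars({}))  # an empty env reports every variable as missing
```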

Edit .env

```shell
AZURE_OPENAI_API_BASE="https://xxx.openai.azure.com/"

# Chat Model
AZURE_OPENAI_CHAT_API_VERSION="2024-12-01-preview"
AZURE_OPENAI_CHAT_API_KEY="<your-chat-api-key>"
AZURE_OPENAI_CHAT_MODEL="gpt-4o"
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME="gpt-4o"

# Embedding Model
AZURE_OPENAI_EMBEDDING_API_VERSION="2024-02-01"
AZURE_OPENAI_EMBEDDING_API_KEY="<your-embedding-api-key>"
AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-3-large"
AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME="text-embedding-3-large"
```
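GraphRAG resolves the `${VAR}` placeholders in settings.yaml from this .env file. The helper below is a simplified illustration of that substitution, not GraphRAG's actual implementation:

```python
import os
import re

def expand_env(text: str) -> str:
    """Replace ${VAR} with its environment value, leaving unknown vars intact."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), m.group(0)), text)

os.environ["AZURE_OPENAI_CHAT_MODEL"] = "gpt-4o"
print(expand_env("model: ${AZURE_OPENAI_CHAT_MODEL}"))  # model: gpt-4o
```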

Create the index

```shell
poetry run poe index --root ./knowledge/01_denodo
```

Test the index

```shell
poetry run poe query --root ./knowledge/01_denodo --method local --query "what data in o_ads?"
```
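Once main.py is served with uvicorn, the same query can be issued over HTTP instead of the CLI. The `/search` route and JSON body shape below are assumptions for illustration; check main.py from graphrag-dify for the actual route and schema:

```python
import json
import urllib.request

def build_search_request(base_url: str, query: str, method: str = "local") -> urllib.request.Request:
    """Build a POST request for the FastAPI search endpoint.

    The /search path and payload fields are hypothetical; adjust them
    to match the routes defined in main.py.
    """
    payload = json.dumps({"query": query, "method": method}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_search_request("http://localhost:8000", "what data in o_ads?")
print(req.full_url)  # http://localhost:8000/search
```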
