Strengthening a Chatbot's Reasoning by Adding the ReAct Pattern

Recap and Goals for This Post
AI Agent Design Patterns
1. ReAct (Reason + Act)
Applying ReAct
1. Implementation (Integration)
2. Result
Closing Thoughts
Future Work
References

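Recap and Goals for This Post

In recent years there has been growing momentum to raise productivity by bringing AI agents built on large language models into everyday work. As many of you may know, our company has already started rolling out Devin. (We plan to hold an internal Devin study session soon.)

However, giving every engineer access to Devin would be very expensive. So, with the goal of building an AI agent that could serve as an alternative, I first built a chatbot based on Claude models using the Anthropic API.

I built the chatbot with Streamlit as the web application framework and LangChain as the library that makes large language models easier to work with. These two libraries are essentially all you need to build a simple chatbot.

But what does it take to reproduce the kind of capability Devin and OpenHands have, where the agent runs code internally and debugs it automatically?

In this post, as a step toward such a fully autonomous AI agent, I look at how a chatbot can be given the ability to execute and verify its work, focusing on the introduction of the ReAct pattern.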

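AI Agent Design Patterns

While looking for ways to reproduce Devin-style execution and debugging, I came across an article that summarizes design patterns for AI agents.

According to that article, there are six design patterns for using AI agents in more advanced ways: handling complex tasks, working more efficiently, and improving output quality.

ReAct (Reason+Act)
Self-Reflective agents
Tree-of-Thought (ToT)
Plan & Execute
Human-in-the-Loop
RL (reinforcement learning) based or self-learning agents

Each has its own characteristics and benefits, so they need to be chosen according to the goal.

Among them, the ReAct pattern has the model alternate between thinking (Reason) and acting (Act). The article describes it as effective for preventing mistakes and improving quality, because reasoning is interleaved with execution.

For that reason, I decided to add the ReAct pattern to the chatbot this time.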

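ReAct (Reason + Act)

The ReAct pattern implemented here solves problems by repeating three steps: Reason, Act, and Observation.

Through this cycle, the agent digs into the input prompt step by step and pulls in external information when needed, which lets it handle more complex tasks.

For example, eating a pudding can be roughly broken down like this:

・Reason : The film has to be peeled off before the pudding can be eaten
・Act : Peel off the film
・Observation : The pudding is ready to eat

By breaking the task down logically like this, the model is said to produce more accurate and appropriate output.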
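The pudding example above can also be written down as data, which makes the shape of one Reason/Act/Observation cycle easier to see. This is only an illustrative sketch: the `ReActStep` record and the `format_trace` helper are hypothetical names I made up for this post, not part of any library or of the chatbot's code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReActStep:
    """One pass through the cycle (hypothetical record type for illustration)."""
    reason: str
    act: str
    observation: str


def format_trace(steps: List[ReActStep]) -> str:
    """Render the steps as readable text, one numbered block per cycle."""
    lines = []
    for number, step in enumerate(steps, start=1):
        lines.append(f"Step {number}")
        lines.append(f"  Reason: {step.reason}")
        lines.append(f"  Act: {step.act}")
        lines.append(f"  Observation: {step.observation}")
    return "\n".join(lines)


# The pudding example expressed as a single cycle
pudding_trace = [
    ReActStep(
        reason="The film has to be peeled off before the pudding can be eaten",
        act="Peel off the film",
        observation="The pudding is ready to eat",
    )
]
print(format_trace(pudding_trace))
```

A longer task would simply append more `ReActStep` entries, with each observation informing the next reason.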

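In addition, using a web search API to fetch external information lets the agent respond flexibly even to information that is not in its training data. The paper cited below states that it uses the Wikipedia API.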

ReAct: Synergizing Reasoning and Acting in Language Models


The article I referred to presents the following ReAct sample code, so let's integrate it into the chatbot from the previous post.

from typing import Any

def reason_and_act(llm_call, user_input: str, tools: dict) -> str:
    """
    A simple example: parse the Reason output and, if it contains an
    instruction such as 'search:' or 'calc:', call the matching tool and
    feed the resulting Observation into the next round of reasoning.
    """

    reasoning_log = ""
    observations_log = []
    answer = ""

    # Note: this sample has no iteration cap; it loops until the model
    # returns an action that is not a tool call
    while True:
        # --- Step1: Reason ---
        # Decide what to do next based on the reasoning and observations so far
        prompt = f"""
        You are an agent that alternates between Reason, Act, and Observation.
        Current reasoning log:
        {reasoning_log}

        Current observations:
        {observations_log}

        Next action or final answer for the user input:
        {user_input}
        """
        response = llm_call(prompt)

        # Split the response into its "Reason:" and "Act:" parts (implementation omitted)
        reason_part, act_part = parse_reason_act(response)

        # Append the reasoning to the log
        reasoning_log += f"\nReason: {reason_part}"

        # --- Step2: Act ---
        # Check whether "Act" is a tool call or the final answer
        if act_part.startswith("search:"):
            query = act_part[len("search:"):].strip()
            # Call the tool and get its result
            search_result = tools["search"](query)
            # --- Step3: Observation ---
            # Feed the result of the action into the next loop's reasoning
            observations_log.append(f"Search result: {search_result}")

        elif act_part.startswith("calc:"):
            expr = act_part[len("calc:"):].strip()
            calc_result = tools["calc"](expr)
            # Record the result of the action as an Observation
            observations_log.append(f"Calc result: {calc_result}")

        else:
            # Treat "Act" as the final answer (FINISH) and leave the loop
            answer = act_part
            break

    return answer
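The sample calls `parse_reason_act` but leaves its implementation out. A minimal sketch of such a parser might look like the following. It assumes the model really does answer in a `Reason: ... Act: ...` layout; the regular expression and the fallback behavior are my own assumptions, not the implementation from the referenced article.

```python
import re
from typing import Tuple

# "Reason:" text, then "Act:" text. \b keeps "ReAct:" inside the reasoning
# from being mistaken for the "Act:" marker. DOTALL lets both parts span lines.
_REASON_ACT = re.compile(r"\bReason:\s*(.*?)\s*\bAct:\s*(.*)", re.DOTALL)


def parse_reason_act(response: str) -> Tuple[str, str]:
    """Split an LLM response into (reason, act).

    Assumption: if the expected markers are missing, the whole response is
    returned as the action, so the calling loop treats it as a final answer.
    """
    match = _REASON_ACT.search(response)
    if match is None:
        return "", response.strip()
    return match.group(1).strip(), match.group(2).strip()


reason, act = parse_reason_act(
    "Reason: I need facts about pudding.\nAct: search: pudding"
)
print(reason)  # I need facts about pudding.
print(act)     # search: pudding
```

Falling back to "treat everything as the action" matches the loop's `else` branch, which ends the loop on anything that is not a tool call. A stricter parser could instead raise an error and ask the model to retry.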

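Applying ReAct

I figured that if the ReAct pattern introduced above could be built into the chatbot from the previous post, it would reproduce, to some extent, the behavior of AI agents such as Devin and OpenHands that alternate between thinking and executing.

Implementation (Integration)

Following the ReAct steps described above, I integrate the pattern into my own chatbot.

In this chatbot, the response style can be specified on line 14 of the prompt. For example, an instruction such as "Always reply in Kansai dialect" controls the tone and output format.

To bring in ReAct, I structured the code so that the model decides its next action while referring to reasoning_log and observations_log.

This should let the model make self-contained decisions based on its earlier reasoning, actions, and observations.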

def reason_and_act(llm_call, user_input: str, tools: dict, max_iterations: int = 5) -> str:
    """
    ReAct pattern implementation: parse the Reason output and, if it
    contains an instruction such as 'search:' or 'calc:', call the matching
    tool and feed the resulting Observation into the next round of reasoning.
    """
    reasoning_log = ""
    observations_log = []
    answer = ""
    
    for iteration in range(max_iterations):
        # --- Step1: Reason ---
        # Decide what to do next based on the reasoning and observations so far
        prompt = f"""
You are an agent that uses the ReAct pattern (Reason-Act-Observe).
Please respond in the following format:

Reason: [Your reasoning about what to do next]
Act: [Your action - either use a tool or provide final answer]

Available tools:
- search: [query] - Search for information
- calc: [expression] - Calculate mathematical expressions
- FINISH: [final answer] - Provide the final answer

Current reasoning log:
{reasoning_log}

Current observations:
{' '.join(observations_log)}

User input: {user_input}

Please provide your reasoning and action:
        """
        
        response = llm_call(prompt)
        
        # Split the response into its "Reason:" and "Act:" parts
        reason_part, act_part = parse_reason_act(response)
        
        # Append the reasoning to the log
        reasoning_log += f"\nIteration {iteration + 1} - Reason: {reason_part}"
        
        # --- Step2: Act ---
        if act_part.lower().startswith("search:"):
            query = act_part[len("search:"):].strip()
            # Call the tool and get its result
            search_result = tools["search"](query)
            # --- Step3: Observation ---
            # Feed the result of the action into the next loop's reasoning
            observations_log.append(f"Search result for '{query}': {search_result}")
            
        elif act_part.lower().startswith("calc:"):
            expr = act_part[len("calc:"):].strip()
            calc_result = tools["calc"](expr)
            # Record the result of the action as an Observation
            observations_log.append(f"Calculation '{expr}': {calc_result}")
            
        elif act_part.lower().startswith("finish:"):
            # Final answer
            answer = act_part[len("finish:"):].strip()
            break
        else:
            # Treat any other "Act" as the final answer and leave the loop
            answer = act_part
            break
    else:
        # The iteration limit was reached without a final answer
        answer = "No final answer was reached within the iteration limit."
    
    # Return the reasoning log and observation log along with the answer
    full_response = f"{answer}\n\n--- ReAct Process ---\n{reasoning_log}\n\nObservations:\n" + "\n".join(observations_log)
    return full_response
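The loop expects `tools` to be a dict with `"search"` and `"calc"` entries, each taking a string and returning something printable. As a rough sketch of what could be passed in, the block below builds a calculator that only accepts basic arithmetic (so the model cannot run arbitrary Python through `calc:`) and a stand-in search over an in-memory dict. The names `safe_calc` and `make_search` and the sample document are assumptions for illustration; the actual chatbot uses its own search tool.

```python
import ast
import operator

# Arithmetic operators the calculator is willing to evaluate
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _evaluate(node):
    """Walk the parsed expression, rejecting anything but plain arithmetic."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def safe_calc(expr: str) -> str:
    """Evaluate +, -, *, / on numbers; return an error string otherwise."""
    try:
        return str(_evaluate(ast.parse(expr, mode="eval")))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        # Returned as text so the model can see the failure as an Observation
        return f"Calculation error: {exc}"


def make_search(documents: dict):
    """Build a stand-in search tool over {title: text} (not a real web search)."""
    def search(query: str) -> str:
        hits = [text for title, text in documents.items() if query.lower() in title.lower()]
        return " / ".join(hits) if hits else f"No results found: {query}"
    return search


tools = {
    "search": make_search({"pudding": "A custard dessert often served with caramel sauce."}),
    "calc": safe_calc,
}
print(tools["calc"]("2 * (3 + 4)"))
print(tools["search"]("Pudding"))
```

Returning error text instead of raising keeps the loop running, and the model gets a chance to correct its expression on the next iteration.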

Result

I made ReAct switchable from the chatbot's sidebar, so normal response mode and ReAct mode can be used as the situation requires.
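The switching logic itself can be kept separate from the UI. The sketch below is an assumption about how that might be arranged, not the code in the repository: `respond` is a hypothetical helper that takes a boolean flag and two callables, one for each mode, and the two lambdas are stand-ins for the real chat functions.

```python
from typing import Callable


def respond(
    user_input: str,
    use_react: bool,
    plain_chat: Callable[[str], str],
    react_chat: Callable[[str], str],
) -> str:
    """Send the input to the ReAct handler or the plain handler, based on the flag."""
    handler = react_chat if use_react else plain_chat
    return handler(user_input)


# Stand-in handlers so the sketch runs without an API key
plain = lambda text: f"[plain] {text}"
react = lambda text: f"[react] {text}"

print(respond("How do I eat pudding?", use_react=False, plain_chat=plain, react_chat=react))
print(respond("How do I eat pudding?", use_react=True, plain_chat=plain, react_chat=react))
```

In a Streamlit app the flag could come from a sidebar widget such as `st.sidebar.checkbox("Use ReAct")`, which returns a bool, and `react_chat` could wrap `reason_and_act` with the model call and tools already bound.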

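As a test, I asked about the "steps for eating a pudding" example from earlier to see what kind of response the ReAct pattern produces.

The bot returned the following response: its answer in Japanese, followed by the model's own English translation and the ReAct log.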

プリンを食べる一般的な手順は以下の通りです：

プリンの容器を開けます。
スプーンを用意します。
プリンの表面からスプーンですくいます。
一口サイズに切り分けて食べます。
カラメルソースがある場合は、プリンと一緒に食べると良いでしょう。
ゆっくりと味わいながら食べます。
日本では特別な食べ方はありませんが、プリンは一般的にデザートとして楽しまれています。個人の好みに応じて、冷たいままや少し常温に戻してから食べるなど、様々な楽しみ方があります。

(Translation: Here are the general steps for eating pudding:

Open the pudding container.
Prepare a spoon.
Scoop from the surface of the pudding with the spoon.
Cut into bite-sized pieces and eat.
If there's caramel sauce, it's good to eat it together with the pudding.
Eat slowly, savoring the flavor.
While there's no special way to eat pudding in Japan, it's generally enjoyed as a dessert. Depending on personal preference, you can enjoy it cold or slightly warmed to room temperature.)

--- ReAct Process ---

Iteration 1 - Reason: To provide information about the steps for eating pudding (プリン), I should first search for the proper procedure in Japanese. This will ensure I have accurate and culturally appropriate information.
Iteration 2 - Reason: The search for "プリンを食べる手順" (steps for eating pudding) didn't yield any results. This could be because the query might be too specific or not commonly searched. I should try a more general search about pudding in Japanese culture to gather relevant information.
Iteration 3 - Reason: The previous searches for specific instructions on eating pudding in Japanese didn't yield results. This suggests that there might not be a standardized or culturally specific way of eating pudding in Japan that differs significantly from other countries. However, to ensure we're not missing any important information, we should try a more general search about pudding in Japanese cuisine or culture. This might give us insights into how it's typically served or consumed.
Iteration 4 - Reason: Based on the previous search attempts and the lack of specific results for eating pudding in Japanese culture, it appears that there might not be a unique or culturally specific method for eating pudding in Japan. However, we can provide general steps for eating pudding that would apply in most contexts, including in Japan. Since we've exhausted our search options and haven't found any specific cultural information, we can proceed to give a general answer about how to eat pudding.

Observations:
Search result for 'プリンを食べる手順': 検索結果が見つかりませんでした: プリンを食べる手順
Search result for '日本のプリンの食べ方': 検索結果が見つかりませんでした: 日本のプリンの食べ方
Search result for '日本のプリン 特徴': 検索結果が見つかりませんでした: 日本のプリン 特徴

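Because the steps leading to the answer are printed only when ReAct is enabled, the user can check whether the model's reasoning matches their intent, which may help reduce misunderstandings. In this run all three searches came back empty (検索結果が見つかりませんでした means "no search results found"), so the final answer rests on the model's own general knowledge.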

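Closing Thoughts

So, what did you think?

In this post I described how I built the ReAct pattern, one of the design approaches for AI agents, into my own chatbot.

The source code is available at the GitHub link below.

As in the previous post, the code uses the Anthropic API, so you need to obtain an API key to run it. If you have an API key for another service (for example OpenAI or Google), you can switch to an alternative model by editing the relevant parts of the source code, such as select_model.

GitHub - morishitaimpl/AI_agent_test: An AI app that uses APIs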


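Future Work

At this point I have not compared results with and without ReAct. Going forward, its effect needs to be evaluated on concrete tasks, for example by giving both modes the same task and having them generate source code.

Also, this time I applied the ReAct pattern to the input prompt, as shown in Figure 1. When the goal is to actually generate source code, however, I think applying the ReAct process on the output side may yield better code.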

References

Maximizing LLM Capabilities with ReAct Agents (ReAct エージェントを使用して LLM の能力を最大化する)

Introduction to AI Agents (1): Five Commonly Used Workflows - Qiita (AIエージェント入門① - よく使われる 5 つのワークフロー)

Introduction to AI Agents (2): Six Advanced Design Patterns - Qiita (AIエージェント入門② - 発展的な 6 つの設計パターン)