Different optimization traces & optimizers on shared code #17
Good question! In Trace, you can have multiple optimizers, each attending to a separate set of parameters, and the paths where those parameters are used can be shared. Is this what you meant? It would help if you could clarify your scenario. E.g., do you mean using different optimizers for different parameters? It will be clearest if you can write down a toy example of such a case, so I can better help :)
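For concreteness, here is a rough sketch of what I mean by "separate parameters, shared path" (the values and feedback strings are just placeholders, and running it requires an LLM backend configured for OptoPrime):

from opto.trace import bundle, node
from opto.optimizers import OptoPrime

# Two parameters, each attended to by its own optimizer.
a = node(1.0, trainable=True)
b = node(2.0, trainable=True)
opt_a = OptoPrime([a])
opt_b = OptoPrime([b])

@bundle()
def shared_path(x, y):
    """A computation path shared by both parameters."""
    return x * y + 1

# Each optimizer propagates its own feedback through the shared path
# and updates only the parameter it owns.
out = shared_path(a, b)
opt_a.zero_feedback()
opt_a.backward(out, "The output should be larger.")
opt_a.step()

out = shared_path(a, b)  # a fresh trace for the second optimizer
opt_b.zero_feedback()
opt_b.backward(out, "The output should be smaller.")
opt_b.step()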
Hi, thanks a lot. Here is a toy example:

from opto.trace import bundle, node
from opto.optimizers import OptoPrime

# Define a shared function with optimization capability
@bundle(trainable=True)
def shared_function(data, param):
    """Shared function processes data with parameter adjustments."""
    return data * param + 1

# Define the AgentMonitoring class with its own optimizer
class AgentMonitoring:
    def __init__(self, name, param1, param2):
        self.name = name
        self.param1 = node(param1, trainable=True)
        self.param2 = node(param2, trainable=True)
        self.optimizer = OptoPrime([self.param1, self.param2])

    @bundle()
    def process_step1(self, input_data):
        """First step of agent processing."""
        return shared_function(input_data, self.param1)

    @bundle()
    def process_step2(self, intermediate_data):
        """Second step of agent processing."""
        return shared_function(intermediate_data, self.param2)

    @bundle()
    def process(self, input_data):
        """Complete processing pipeline for the agent."""
        intermediate = self.process_step1(input_data)
        return self.process_step2(intermediate)

    def optimize(self, feedback):
        """Optimize agent parameters based on feedback."""
        self.optimizer.backward(self.param1, feedback)
        self.optimizer.backward(self.param2, feedback)
        self.optimizer.step()

# Define inter-agent workflow optimization with a specific optimizer
@bundle()
def agent_interaction_loop(agent1, agent2, data, iterations, optimizer, optimize_every):
    """Defines workflow between two agents with iterative refinement."""
    result = data
    for i in range(iterations):
        step1 = agent1.process(result)
        result = agent2.process(step1)
        if (i + 1) % optimize_every == 0:
            print(f"Inter-agent optimization step at iteration {i + 1} for {agent1.name}-{agent2.name}")
            optimizer.backward(result, "Output should be larger.")
            optimizer.step()
    return result

# Define multi-agent workflow optimization with specific inter-agent optimizers
@bundle()
def multi_agent_workflow_loop(agents, data, iterations, workflow_optimizer, optimize_every):
    """Multi-agent collaborative optimization with diverse actions."""
    result = data
    interaction_optimizers = [
        OptoPrime([agents[j].param1, agents[j].param2, agents[j + 1].param1, agents[j + 1].param2])
        for j in range(len(agents) - 1)
    ]
    for i in range(iterations):
        # Inter-agent interactions
        for j, optimizer in enumerate(interaction_optimizers):
            result = agent_interaction_loop(
                agents[j], agents[j + 1], result, iterations=1, optimizer=optimizer, optimize_every=1
            )
        # Independent agent optimizations
        for agent in agents:
            result = agent.process(result)
            agent.optimize("Individual agent optimization feedback.")
        # Periodic optimization for the entire workflow
        if (i + 1) % optimize_every == 0:
            print(f"Multi-agent optimization step at iteration {i + 1}")
            workflow_optimizer.backward(result, "Workflow optimization feedback.")
            workflow_optimizer.step()
    return result

# Set up agents with distinct parameters
agent1 = AgentMonitoring("Agent1", param1=1.0, param2=1.5)
agent2 = AgentMonitoring("Agent2", param1=2.0, param2=2.5)
agent3 = AgentMonitoring("Agent3", param1=3.0, param2=3.5)

# Initialize data and workflow optimizer
data = node(10, trainable=False)
workflow_optimizer = OptoPrime([
    agent1.param1, agent1.param2,
    agent2.param1, agent2.param2,
    agent3.param1, agent3.param2
])

# Run multi-agent workflow with specific inter-agent optimizers
result = multi_agent_workflow_loop(
    [agent1, agent2, agent3],
    data,
    iterations=10,
    workflow_optimizer=workflow_optimizer,
    optimize_every=5
)
print(f"Final result of multi-agent workflow: {result.data}")

# Print optimized parameters
print("Optimized Parameters:")
print(f"Agent1: param1={agent1.param1.data}, param2={agent1.param2.data}")
print(f"Agent2: param1={agent2.param1.data}, param2={agent2.param2.data}")
print(f"Agent3: param1={agent3.param1.data}, param2={agent3.param2.data}")
This is really helpful. I haven't tried running the code, but basically you want the same parameters to be optimized by multiple optimizers with different feedback, right? For example, agent1.param1 is optimized by agent1's own optimizer, by one of the interaction optimizers, and by the workflow_optimizer.
Another thing that is off in the example is that shared_function is not really optimized, since its parameter is not given to any optimizer. Hope this helps.
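For example, something along these lines could put shared_function's code under optimization (this is only a sketch: it assumes the trainable bundle exposes its code as a parameter via shared_function.parameter, so double-check the attribute name in your version; the feedback string is a placeholder):

# Register the trainable function's code as a parameter of an optimizer.
code_optimizer = OptoPrime([shared_function.parameter])

out = shared_function(data, agent1.param1)
code_optimizer.zero_feedback()
code_optimizer.backward(out, "The function should also clip negative outputs to zero.")
code_optimizer.step()  # may propose a rewrite of shared_function's code based on the feedback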
Thanks so much for your detailed response and for pointing out those issues in this toy code. You're right: I missed calling zero_feedback before each backward call, and I wasn't aware that I needed to set retain_graph=True when performing multiple backward passes on the same graph. I understand that retain_graph=True allows backward to be called more than once on the same node. Is there any unexpected behaviour after it has been called several times by, for example, optimizer2? Should that be prevented by calling zero_feedback() at the right time? Regarding shared_function, you're correct that its parameter is not being optimized. This is toy code and I do not yet need to optimize the code of a shared function, but it will happen, so I am wondering whether you have already encountered such a case. Would calling step() from optimizer2 update the function's code based on the latest version of the code produced by any optimizer that called step(), or only on the latest step() from optimizer2 itself? I have to admit that I am still not clear about the status of shared nodes when they are trained, or merely used, by several optimizers.
Calling backward multiple times accumulates feedback on the parameters. Whether this causes issues depends on how the optimizer aggregates the feedback dict. Currently, the optimizer implementation performs a full aggregation: it combines the feedback (graphs) of each child, and then merges the aggregated graphs from the different children into a single graph. We currently assume graphs can only be combined when their output feedback is identical (otherwise, an error is thrown). So, regarding "is there any unexpected behaviour after being called several times by for example optimizer2?": in your scenario you will likely see a runtime error if the feedback provided in each backward call is different.
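To make the call order concrete, here is a rough sketch (process is just an illustrative bundled function, the feedback strings are placeholders, and the exact error behaviour depends on your Trace version):

from opto.trace import bundle, node
from opto.optimizers import OptoPrime

@bundle()
def process(v):
    """A simple traced computation (illustrative only)."""
    return v * 2

x = node(1.0, trainable=True)
opt1 = OptoPrime([x])
opt2 = OptoPrime([x])  # the same parameter is shared by a second optimizer

out = process(x)

# First pass: retain the graph so a second backward on `out` is possible.
opt1.zero_feedback()
opt1.backward(out, "Output should be larger.", retain_graph=True)
opt1.step()

# Second pass with different feedback: calling zero_feedback() first clears the
# feedback accumulated on x, so the aggregation never has to combine two
# different feedback strings on the same output.
opt2.zero_feedback()
opt2.backward(out, "Output should be an even number.")
opt2.step()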
Hope this answers your questions.
Is it possible to support different traces & optimizations on shared functions/nodes using decorators? What would be the best practice?
In my case, I need to perform different traces & optimizations on shared code files/functions:
Thanks a lot