Taming GraalVM Reflection with AI Agents: Lessons from Testing 1000 Libraries