Exactly, searching for text is one reliable way to do it!
You can get even more dynamic than that, but it digs a little deeper than just understanding the fundamentals. ?
Edit: Also, after look at your code more, you have this:
smithFace.interact("Make sets:");
You should do:
if(smithFace.interact("Make sets:")){
// Code in here
}
The reason you should do this is because you ONLY want to move the mouse and do the conditional sleep, or anything else you can think of, ONLY if the action was successful. Since it is a boolean, it will return true or false.