Just make sure to test the regex instead of blindly slapping it in assuming it works
What if I say "it's probably okay just this one time" before I do it every time?
Ah I've tested this method, shit breaks a lot. Still my go to.
Can we just have another LLM check the work for us? Like an LLM-GAN?
I'm not sure if it's still the case, but asking it to review what it just wrote for errors has led to significant quality improvements previously.
The new Code Interpreter plugin that went live this week for Plus users can actually execute Python code in a sandboxed environment. This allows you to add "Write and execute tests for the regex" to the end of your prompt.
Regex101 is a sandbox env specifically for Regex
It's not just for writing and testing against samples. It will also explain the parts of the regex.
However, it won't generate examples that will pass the regex, which may be the biggest benefit of ChatGPT.
This is the way. Everything ChatGPT produces for me gets tested and debugged here.
This is where I go to validate the work of ChatGPT. The debugging capabilities in that site are wonderful.
I've tried it and found it wanting at regex and Excel formulas, but I'm glad to hear it's working for you! Are you using 4? I haven't tried that one and I hear it's better.
I typically try 3.5 first and switch to 4 if the results aren't great. 3.5 typically handles basic use cases quite well, for example, writing regex that detects jira ticket naming nomenclature. For more complex things, I go to 4.
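For anyone wondering what the "basic use case" plus "test it" combo looks like, here's a minimal Python sketch. The key format is an assumption on my part (uppercase project key, hyphen, number with no leading zero, like `PROJ-123`), and `find_ticket_keys` is just a name made up for the example. Adjust the pattern if your instance uses different project keys.

```python
import re

# Assumption: keys look like "PROJ-123": an uppercase letter, then more
# uppercase letters or digits, a hyphen, and a number without a leading zero.
JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-[1-9][0-9]*\b")


def find_ticket_keys(text):
    """Return every ticket-like key found in the text, in order."""
    return JIRA_KEY.findall(text)


# Test cases written up front, so the pattern is checked rather than trusted.
samples = {
    "Fixes PROJ-123 and OPS2-7.": ["PROJ-123", "OPS2-7"],
    "lowercase proj-123 is ignored": [],
    "PROJ-0 is not a valid number": [],
    "no ticket here": [],
}
for text, expected in samples.items():
    assert find_ticket_keys(text) == expected, (text, find_ticket_keys(text))
```

The same sample table works as something to paste into the prompt, so the model has concrete cases to satisfy.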
It sometimes gets things wrong, but I've also found that just saying "that didn't work" gets it to reevaluate for more complex situations
It helps if you hold ChatGPT's hand and walk it through what you need. For example, if you have a regex with 3 requirements, ask it to write a regex for the first requirement, then ask it to modify the previous output to add another requirement, and so on. That way you can sort of "audit" it as it generates the correct regex.
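A rough sketch of that step-by-step audit in Python. The three requirements here are hypothetical (a username made of letters and digits, starting with a letter, 3 to 16 characters long), and `accepts` is a helper invented for the example. The idea is that each step gets its own pattern and its own checks before you ask for the next change.

```python
import re

# Hypothetical requirements for a username, added one at a time.
# 1. Only ASCII letters and digits.
step1 = re.compile(r"[A-Za-z0-9]+")
# 2. ...and it must start with a letter.
step2 = re.compile(r"[A-Za-z][A-Za-z0-9]*")
# 3. ...and it must be 3 to 16 characters in total.
step3 = re.compile(r"[A-Za-z][A-Za-z0-9]{2,15}")


def accepts(pattern, text):
    """True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


# Audit each step before moving on to the next requirement.
assert accepts(step1, "9lives") and not accepts(step1, "no spaces")
assert accepts(step2, "ab") and not accepts(step2, "9lives")
assert accepts(step3, "abc") and not accepts(step3, "ab")
assert not accepts(step3, "a" * 17)
```

If a later step breaks a check that passed earlier, you know exactly which modification introduced the problem.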
there is some more discussion of this in a similar post from a few days ago.
So I was trying to write a regex for use with my ChatGPT discord bot. I wanted to trim off any final partial sentence at the end. I went around and around with it for a couple of hours because look ahead and look behind are just not something I do often enough.
It kept writing more and more complicated regex that didn't work. The final solution, while not exactly perfect - it won't keep a quote at the end of a sentence, and honorifics like Mr. and Dr. throw it - wasn't nearly as complicated as ChatGPT was making it. It still never did give me anything working - I just fucked around on regex101 until I got it right. As usual, but after wasting 90 minutes or so.
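The regex from that comment wasn't posted, so this is not it, just a minimal Python sketch of one way to trim a trailing partial sentence. It assumes a sentence ends at `.`, `!` or `?` followed by whitespace or the end of the text, and `trim_partial_sentence` is a made-up name. It has the same kind of limitations described above: honorifics like "Dr." count as a sentence end, and a closing quote after the punctuation gets dropped.

```python
import re

# Greedy match from the start up to the LAST ., ! or ? that is followed
# by whitespace or the end of the text. DOTALL lets "." span newlines.
LAST_COMPLETE = re.compile(r"^.*[.!?](?=\s|$)", re.DOTALL)


def trim_partial_sentence(text):
    """Drop any unfinished sentence at the end of the text.

    If no sentence-ending punctuation is found, the text is returned
    unchanged (a design choice; returning "" would also be defensible).
    """
    match = LAST_COMPLETE.match(text)
    return match.group(0) if match else text


print(trim_partial_sentence("Hello there. How are you? I was thinking that"))
print(trim_partial_sentence("Ask Dr. Smith about"))  # known limitation
```

The lookahead keeps decimals like "3.5" from counting as a sentence end, since the dot there is followed by a digit rather than whitespace.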
I've found that you need to be very careful when asking it to modify things it produced directly without making significant changes to the regex it provides. Once I get to the 3rd or 4th iteration of asking it to modify previous responses, the likelihood that it starts hallucinating increases dramatically. The best solution I've found to this is to put your entire request in a single prompt that walks it through all requirements step-by-step.
You can improve the reliability if you provide it test cases. You can now be the PM you wish you had for the robot that will eventually replace you.
I hated everything about this comment, thanks.
Also curious. If I had some AI help with regex that would be awesome. But I felt as you said it wouldn't work great without 4. Which I don't have.
I agree, my regex experience was not great.
If you think regex is the hard part of programming, then you're in for a bad time.
I often need to deal with half a dozen different programming languages in any day/week and the context switching can be difficult at times. When you've spent all day switching between JavaScript, Python, and YAML and suddenly need to draft some Regex, tools like ChatGPT can help immensely at reducing the mental burden of switching gears.
The syntax of regular regexes is the same across languages though. It's just the regex library which is different, but so is every other library between languages.
If the project is less than a thousand lines of code in a language with a garbage collector, it probably is. Most other problems don't require learning a DSL to handle them, and most other DSLs aren't nearly as terse.
"I have this problem. I know what I'll do! I'll use regex to fix it!"
Uses regex.
"Yay problem is now fixed it works!"
Now has 2 problems.
totally!!!
Thanks for this post, I use regex often and did not know GPT would be good at this.
That's the problem. It will confidently give you a correct-sounding answer.
Whether it is actually true is a different topic. So don't just blindly trust it. Verify, or at least sanity check it.
This this this!!! I know this is a post from the place that shall not be named, but it just showcases the issues with ChatGPT (this is from when GPT4 was just released)
My biggest problem with it has been that it doesn't necessarily understand that some things are impossible - for example, variable-length lookbehinds.
That depends on the regex flavor. Some of them have full support for variable-length lookbehinds, for example JavaScript and the third-party regex module for Python.
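To make the flavor difference concrete, here's a small sketch using only Python's built-in `re` module, which is one of the flavors that rejects variable-length lookbehinds at compile time. The `compiles` helper is just a name for this example. The capture-group rewrite at the end is one common workaround, not the only one.

```python
import re


def compiles(pattern):
    """True if the standard-library re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


assert compiles(r"(?<=abc)def")      # fixed-width lookbehind: accepted
assert not compiles(r"(?<=a+)b")     # variable-width lookbehind: rejected
assert not compiles(r"(?<=ab|c)d")   # alternatives of different widths: rejected

# Workaround in stdlib re: match the prefix normally and capture what you want.
match = re.search(r"a+(b)", "xaaab")
assert match is not None and match.group(1) == "b"
```

So if ChatGPT hands you a variable-length lookbehind, whether it is "impossible" depends on which engine will actually run it.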
A variable length lookbehind is the same as the opposite of a variable length lookahead.
You can also ask it to write VBA code for Excel, or Jira queries.
Still a bit new to Jira, what are Jira queries?
Typically called JQL, it's a simple query language to find info. For example, there's a simple query to find epics with a particular affects version and/or fix version, or return epics that are missing information in a particular field.
The default or basic Jira can't do some things though. Like I haven't been able to get the total number of story points from issues within an epic. I think you need a 3rd party plugin for that.
That's nice to know, hopefully I can bring that up during our sprint planning sessions when necessary.
I have yet to see a regex that is so complicated that I would need some help. I expect programmers to know how to use regexes, but it seems that's not the case. And when it becomes too big, you can always write verbose regexes with comments, which is even easier. If someone could show me something too difficult for a human being (excluding the regex to validate emails), I'm interested.
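For those who haven't seen a verbose regex with comments, here's a minimal Python sketch using `re.VERBOSE`. The date pattern is just an illustration picked for this example. In verbose mode, whitespace in the pattern is ignored and `#` starts a comment.

```python
import re

# Illustrative pattern: an ISO-style date such as 2023-07-14.
ISO_DATE = re.compile(
    r"""
    (?P<year>\d{4})                 # four-digit year
    -
    (?P<month>0[1-9]|1[0-2])        # month, 01 to 12
    -
    (?P<day>0[1-9]|[12]\d|3[01])    # day, 01 to 31 (no per-month check)
    """,
    re.VERBOSE,
)

match = ISO_DATE.fullmatch("2023-07-14")
assert match is not None and match.group("month") == "07"
assert ISO_DATE.fullmatch("2023-13-01") is None  # month out of range
```

Named groups plus comments go a long way toward making the pattern readable six months later.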
Regex isn't difficult, just annoying to ensure it is bug-free. If ChatGPT can help, then I don't know why you wouldn't be in favor of it
It's not that I'm incapable of evaluating regex, but rather the mental burden of evaluating complex regex statements and determining their purpose can be time-consuming. Why take 20 minutes to understand some regex when ChatGPT can do it in 20 seconds?
A coworker once defined regex as a write-only language and he definitely had a point. I love regex but it can be time consuming figuring out exactly what a complex regex expression is doing.
It's often developers who never took a finite automata class who I've seen struggle with regular expressions.
It's kind of like writing code in C while not understanding how memory management works
Huh. That class looked hard as hell, I didn't take it, and now I'm 2 years out of school still googling regex every time I need it.
Maybe I should do some reading
It was mandatory. I'm glad I took it, but I'm glad it's over
Just look up how finite automata work. You don't need to understand Turing machines or Turing completeness
Wait, you guys don't use AI to make regex?
I use regex101.com
Up to now that usually was faster than trying to get ChatGPT to generate something worthwhile. However, if you define some test cases first, the combination of both will even get the sales guy there eventually.
Ugh god it's been a shit day with sales, let's not bring them up. The turds.
I tried it and naaah it's not that great. Keeps giving a rule for sample text too, despite really making it clear that I want a more general one.