this post was submitted on 09 Jul 2023

241 points (94.5% liked)


LPT: ChatGPT is incredible for generating and evaluating regex (programming.dev)

submitted 1 year ago by [email protected] to c/[email protected]


I have to use a ton of regex in my new job (plz save me), and I use ChatGPT for all of it. My job would be 10x harder if it wasn't for ChatGPT. It provides extremely detailed examples and warns you of situations where the regex may not perform as expected. Seriously, try it out.


[–] [email protected] 80 points 1 year ago (2 children)

Just make sure to test the regex instead of blindly slapping it in assuming it works 🙂

[–] [email protected] 29 points 1 year ago (1 children)

What if I say "it's probably okay just this one time" before I do it every time?

[–] [email protected] 9 points 1 year ago (1 children)

Ah I've tested this method, shit breaks a lot. Still my go to.

[–] [email protected] 4 points 1 year ago (1 children)

Can we just have another LLM check the work for us? Like an LLM-GAN?

[–] [email protected] 4 points 1 year ago

I'm not sure if it's still the case, but asking it to review what it just wrote for errors has led to significant quality improvements previously.

[–] [email protected] 8 points 1 year ago (1 children)

The new Code Interpreter plugin that went live this week for Plus users can actually execute Python code in a sandboxed environment. This allows you to add "Write and execute tests for the regex" to the end of your prompt.

[–] [email protected] 15 points 1 year ago* (last edited 1 year ago) (3 children)

Regex101 is a sandbox env specifically for Regex

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago)

It's not just for writing regex and testing it against samples. It will also explain the parts of the regex.

However, it won't generate examples that will pass the regex, which may be the biggest benefit of ChatGPT.

[–] [email protected] 3 points 1 year ago

This is the way. Everything ChatGPT produces for me gets tested and debugged here.

[–] [email protected] 1 points 1 year ago

This is where I go to validate the work of ChatGPT. The debugging capabilities in that site are wonderful.

[–] [email protected] 28 points 1 year ago (4 children)

I’ve tried it and found it wanting at regex and Excel formulas, but I’m glad to hear it’s working for you! Are you using 4? I haven’t tried that one and I hear it’s better.

[–] [email protected] 15 points 1 year ago (2 children)

I typically try 3.5 first and switch to 4 if the results aren't great. 3.5 typically handles basic use cases quite well, for example, writing regex that detects Jira ticket naming conventions. For more complex things, I go to 4.

It sometimes gets things wrong, but I've also found that just saying "that didn't work" gets it to reevaluate for more complex situations
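As an illustration of the kind of "basic use case" mentioned above, a Jira-style ticket key pattern might look like the sketch below. The key format assumed here (an uppercase project key, a hyphen, then an issue number, e.g. `PROJ-123`) and the helper name `find_ticket_keys` are assumptions for illustration; a real Jira instance may be configured differently.

```python
import re

# Assumed key format: an uppercase letter, then more uppercase letters/digits,
# a hyphen, and a positive issue number (e.g. "PROJ-123").
JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-[1-9][0-9]*\b")


def find_ticket_keys(text: str) -> list[str]:
    """Return every substring of `text` that looks like a ticket key."""
    return JIRA_KEY.findall(text)


if __name__ == "__main__":
    print(find_ticket_keys("Fixes PROJ-123 and relates to OPS2-7, not proj-9."))
```

The word boundaries keep the pattern from matching inside longer tokens; whether lowercase keys should count is a choice to check against your own project.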

[–] [email protected] 21 points 1 year ago

It helps if you hold ChatGPT's hand and walk it through what you need. For example, if you have a regex with 3 requirements, ask it to write a regex for the first requirement, then ask it to modify the previous output to add another requirement, and so on. That way you can sort of "audit" it as it generates the correct regex.

There is some more discussion of this in a similar post from a few days ago.

[–] [email protected] 4 points 1 year ago (1 children)

So I was trying to write a regex for use with my ChatGPT discord bot. I wanted to trim off any final partial sentence at the end. I went around and around with it for a couple of hours because look ahead and look behind are just not something I do often enough.

It kept writing more and more complicated regex that didn’t work. The final solution isn’t exactly perfect (it won’t keep a quote at the end of a sentence, and honorifics like Mr. and Dr. throw it off), but it wasn’t nearly as complicated as ChatGPT was making it. ChatGPT never did give me anything working; I just fucked around on regex101 until I got it right. As usual, but having wasted 90 minutes or so.
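For reference, one simple way to trim a trailing partial sentence without any lookahead or lookbehind is sketched below. This is not the commenter's actual regex (which isn't shown in the thread); the function name `trim_partial_sentence` is hypothetical, and the sketch has the same limitations described above: a closing quote after the final period is dropped, and honorifics like "Dr." are treated as sentence ends.

```python
import re

# Greedy match up to the LAST '.', '!' or '?'; DOTALL lets '.' span newlines.
_LAST_SENTENCE_END = re.compile(r".*[.!?]", re.DOTALL)


def trim_partial_sentence(text: str) -> str:
    """Drop any trailing text after the final sentence terminator."""
    match = _LAST_SENTENCE_END.match(text)
    # No terminator at all: return the text unchanged rather than emptying it.
    return match.group(0) if match else text


if __name__ == "__main__":
    print(trim_partial_sentence("It works. Mostly! But then it"))
```

Relying on greedy matching instead of lookarounds keeps the pattern short, at the cost of the edge cases noted.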

[–] [email protected] 1 points 1 year ago

I've found that you need to be very careful when asking it to directly modify a regex it produced earlier. By the 3rd or 4th iteration of asking it to modify previous responses, I've found the likelihood that it starts hallucinating increases dramatically. The best solution I've found is to put your entire request in a single prompt that walks it through all the requirements step by step.

[–] [email protected] 11 points 1 year ago (1 children)

You can improve the reliability if you provide it test cases. You can now be the PM you wish you had for the robot that will eventually replace you.
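The same test cases you hand to the model can double as a quick check on whatever it returns. A minimal sketch, assuming a hypothetical helper `failing_cases` and a made-up candidate pattern (neither comes from the thread):

```python
import re


def failing_cases(pattern: str, should_match: list[str], should_not_match: list[str]) -> list[str]:
    """Return the inputs on which `pattern` behaves differently than expected."""
    compiled = re.compile(pattern)
    failures = [s for s in should_match if not compiled.fullmatch(s)]
    failures += [s for s in should_not_match if compiled.fullmatch(s)]
    return failures


# Hypothetical example: a naive date pattern of the kind a model might suggest first.
candidate = r"\d{4}-\d{2}-\d{2}"

if __name__ == "__main__":
    # Any input listed here is one the candidate pattern gets wrong.
    print(failing_cases(candidate, ["2023-07-09"], ["2023-7-9", "2023-13-40"]))
```

An empty list means every case passed; anything else can be pasted straight back into the next prompt.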

[–] [email protected] 2 points 1 year ago

I hated everything about this comment, thanks.

[–] [email protected] 6 points 1 year ago

Also curious. If I had some AI help with regex that would be awesome. But I felt as you said it wouldn’t work great without 4. Which I don’t have.

[–] [email protected] 5 points 1 year ago

I agree, my regex experience was not great.

[–] [email protected] 21 points 1 year ago (2 children)

If you think regex is the hard part of programming, then you're in for a bad time.

[–] [email protected] 4 points 1 year ago (1 children)

I often need to deal with half a dozen different programming languages in any day/week and the context switching can be difficult at times. When you've spent all day switching between JavaScript, Python, and YAML and suddenly need to draft some Regex, tools like ChatGPT can help immensely at reducing the mental burden of switching gears.

[–] [email protected] 15 points 1 year ago

The syntax of regular regexes is the same across languages though. It's just the regex library which is different, but so is every other library between languages.

[–] [email protected] 1 points 1 year ago

If the project is less than a thousand lines of code in a language with a garbage collector, it probably is. Most other problems don't require learning a DSL to handle them, and most other DSLs aren't nearly as terse.

[–] [email protected] 18 points 1 year ago (2 children)

"i have this problem I know what I'll do! I'll use regex to fix it!"

Uses regex.

"Yay problem is now fixed it works!"

Now has 2 problems.

[–] [email protected] 2 points 1 year ago

I feel bad for you, son...

[–] [email protected] 1 points 1 year ago

totally!!!

[–] [email protected] 6 points 1 year ago (1 children)

Thanks for this post, I use regex often and did not know GPT would be good at this.

[–] [email protected] 14 points 1 year ago (1 children)

That's the problem. It will confidently give you a correct-sounding answer.

Whether it is actually true is a different matter. So don't just blindly trust it. Verify, or at least sanity check it.

[–] [email protected] 8 points 1 year ago

This this this!!! I know this is a post from the place that shall not be named, but it just showcases the issues with ChatGPT (this is from when GPT4 was just released)

[–] [email protected] 6 points 1 year ago (2 children)

My biggest problem with it has been that it doesn't necessarily understand that some things are impossible - for example, variable-length lookbehinds.

[–] [email protected] 2 points 1 year ago

That depends on the regex flavor. Some of them have full support for variable-length lookbehinds, for example JavaScript and the third-party regex module for Python.
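To make the flavor difference concrete: Python's standard-library `re` module is one of the flavors that rejects variable-width lookbehinds. The sketch below checks that and shows a common capture-group workaround; the helper name `compiles` and the sample patterns are made up for illustration.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if the standard-library `re` module accepts `pattern`."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A common workaround: match the prefix normally and capture what follows it.
AMOUNT_AFTER_USD = re.compile(r"USD +(\d+)")

if __name__ == "__main__":
    print(compiles(r"(?<=USD )\d+"))   # fixed-width lookbehind
    print(compiles(r"(?<=USD +)\d+"))  # variable-width lookbehind
    print(AMOUNT_AFTER_USD.findall("USD 5, USD   12"))
```

With the standard `re` module the first pattern compiles and the second raises an error, so a model that suggests the second form needs to be told which engine you are targeting.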

[–] [email protected] 1 points 1 year ago

A variable length lookbehind is the same as the opposite of a variable length lookahead.

[–] [email protected] 4 points 1 year ago (1 children)

You can also ask it to write VBA code for Excel, or Jira queries.

[–] [email protected] 2 points 1 year ago (1 children)

Still a bit new to Jira, what are Jira queries?

[–] [email protected] 3 points 1 year ago (1 children)

Typically called JQL, it's a simple query language to find info. For example, there's a simple query to find epics with a particular affects version and/or fix version, or return epics that are missing information in a particular field.

The default or basic Jira can't do some things though. Like I haven't been able to get the total number of story points from issues within an epic. I think you need a 3rd party plugin for that.

[–] [email protected] 2 points 1 year ago

That's nice to know, hopefully I can bring that up during our sprint planning sessions when necessary.

[–] [email protected] 4 points 1 year ago (3 children)

I have yet to see a regex so complicated that I would need help with it. I expect programmers to know how to use regexes, but it seems that's not the case. And when one becomes too big, you can always write a verbose regex with comments, which is even easier. If someone could show me something too difficult for a human being (excluding the regex to validate emails), I'm interested.
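For anyone who hasn't seen the verbose style mentioned above, here is a minimal sketch using Python's `re.VERBOSE` flag. The date pattern is a made-up example, not something from the thread.

```python
import re

# Hypothetical pattern for illustration: an ISO-style date such as 2023-07-09.
# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment.
ISO_DATE = re.compile(
    r"""
    (?P<year>\d{4})                   # four-digit year
    -
    (?P<month>0[1-9]|1[0-2])          # month 01-12
    -
    (?P<day>0[1-9]|[12]\d|3[01])      # day 01-31 (no per-month check)
    """,
    re.VERBOSE,
)

if __name__ == "__main__":
    match = ISO_DATE.fullmatch("2023-07-09")
    print(match.groupdict() if match else None)
```

Named groups plus inline comments go a long way toward making a pattern readable months later.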

[–] [email protected] 7 points 1 year ago

Regex isn't difficult, just annoying to ensure it is bug-free. If ChatGPT can help, then I don't know why you wouldn't be in favor of it

[–] [email protected] 6 points 1 year ago (1 children)

It's not that I'm incapable of evaluating regex, but rather the mental burden of evaluating complex regex statements and determining their purpose can be time-consuming. Why take 20 minutes to understand some regex when ChatGPT can do it in 20 seconds?

[–] [email protected] 8 points 1 year ago

A coworker once defined regex as a write-only language and he definitely had a point. I love regex but it can be time consuming figuring out exactly what a complex regex expression is doing.

[–] [email protected] 2 points 1 year ago (1 children)

It's often developers who never took a finite automata class who I've seen struggle with regular expressions.

It's kind of like writing code in C while not understanding how memory management works

[–] [email protected] 1 points 1 year ago (1 children)

Huh. That class looked hard as hell, I didn't take it, and now I'm 2 years out of school still googling regex every time I need it.

Maybe I should do some reading 😅

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

It was mandatory. I'm glad I took it, but I'm glad it's over 😂😂😂

Just look up how finite automata work. You don't need to understand Turing machines or Turing completeness.
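As a taste of what that looks like, here is a tiny hand-written deterministic finite automaton that accepts the same strings as the regex `ab*c`. The state names and the function `dfa_accepts` are invented for this sketch; real regex engines are considerably more involved.

```python
# Transition table for a DFA equivalent to the regex ab*c.
# Missing (state, character) pairs mean "no transition", i.e. reject.
TRANSITIONS = {
    ("start", "a"): "seen_a",
    ("seen_a", "b"): "seen_a",
    ("seen_a", "c"): "accept",
}


def dfa_accepts(text: str) -> bool:
    """Run the DFA over `text` and report whether it ends in the accept state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined: reject immediately
            return False
    return state == "accept"


if __name__ == "__main__":
    for sample in ["ac", "abbbc", "ab", "abcc"]:
        print(sample, dfa_accepts(sample))
```

Each character moves the machine to exactly one next state, which is why matching of this kind runs in a single pass over the input.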

[–] [email protected] 3 points 1 year ago (1 children)

Wait, you guys don’t use AI to make regex?

[–] [email protected] 11 points 1 year ago* (last edited 1 year ago) (1 children)

I use regex101.com

Up to now that was usually faster than trying to get ChatGPT to generate something worthwhile. However, if you define some test cases first, the combination of both will even get the sales guy there eventually.

[–] [email protected] 2 points 1 year ago

Ugh god it’s been a shit day with sales, let’s not bring them up. The turds.

[–] [email protected] 1 points 1 year ago

I tried it and naaah it's not that great. Keeps giving a rule for sample text too, despite really making it clear that I want a more general one.
