feat: better repr for DataType #10391

Open
1 task done
NickCrews opened this issue Oct 29, 2024 · 2 comments
Labels
feature Features or general enhancements

Comments

@NickCrews
Contributor

Is your feature request related to a problem?

This is just a small papercut. If I have a complex dtype, the repr is hard to read because it is all on one line:

import ibis

dt = ibis.dtype(
    "array<struct<line1: string, line2: string, city: string, state: string, zipcode: string>>"
)
dt
# Array(value_type=Struct([('line1', String(nullable=True)), ('line2', String(nullable=True)), ('city', String(nullable=True)), ('state', String(nullable=True)), ('zipcode', String(nullable=True))], nullable=False), nullable=False)

I think the most readable form would be a pretty-printed version of the string DSL (option A):

dtype("""!array<
    !struct<
        line1: string,
        line2: string,
        city: string,
        state: string,
        zipcode: string
    >
>""")

which can also be copy-pasted back into a REPL to recreate the original dtype.

Could also just do the string (option B):

!array<
    !struct<
        line1: string,
        line2: string,
        city: string,
        state: string,
        zipcode: string
    >
>

We should be clever, and only do newlines/pretty if the type is over some metric of complexity, perhaps just string length?

E.g., I would want array<string> and not

array<
    string
>
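
For illustration, the length-threshold idea could be sketched roughly as below. This is a hypothetical helper (`pretty_dtype`, `max_width`, and `indent` are made-up names, not part of ibis) that works purely on the DSL string: short types are returned unchanged, and longer ones get a line break after every `<` and `,` that is not inside parentheses.

```python
def pretty_dtype(dsl: str, max_width: int = 40, indent: int = 4) -> str:
    """Hypothetical sketch: keep short DSL strings on one line, expand long ones."""
    compact = " ".join(dsl.split())  # normalize whitespace
    if len(compact) <= max_width:
        return compact

    out = []
    depth = 0           # nesting level of <...>
    parens = 0          # nesting level of (...), e.g. decimal(10, 2)
    skip_space = False  # drop the space that follows "<" or "," once we break the line
    for ch in compact:
        if skip_space and ch == " ":
            continue
        skip_space = False
        if ch == "(":
            parens += 1
        elif ch == ")":
            parens -= 1
        if parens == 0 and ch == "<":
            depth += 1
            out.append("<\n" + " " * (indent * depth))
            skip_space = True
        elif parens == 0 and ch == ">":
            depth -= 1
            out.append("\n" + " " * (indent * depth) + ">")
        elif parens == 0 and ch == ",":
            out.append(",\n" + " " * (indent * depth))
            skip_space = True
        else:
            out.append(ch)
    return "".join(out)


print(pretty_dtype("array<string>"))
print(pretty_dtype(
    "!array<!struct<line1: string, line2: string, city: string, "
    "state: string, zipcode: string>>"
))
```

Under these assumptions, `array<string>` stays on one line and the address example comes out in the option B layout. The sketch is all-or-nothing: it does not look at terminal width or try to keep small nested pieces on a single line.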

I am basing these ideas on two needs: human readability, and being able to reproduce the dtype in the REPL from the output. Are there other considerations I'm not thinking of?

What is the motivation behind your request?

No response

Describe the solution you'd like

see above

What version of ibis are you running?

main

What backend(s) are you using, if any?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@cpcloud
Member

cpcloud commented Oct 30, 2024

> We should be clever, and only do newlines/pretty if the type is over some metric of complexity, perhaps just string length?

I'd encourage you to look into the staggering complexity of pretty printing a string before taking any action on this.

We should definitely avoid anything that moves us in the direction of having to consider line breaks and string length when formatting the type string.

@NickCrews
Contributor Author

Ooof, ok thanks for that, I didn't expect it would be that tricky, but I'm sold.

What about going to the more succinct "array<>" DSL, but that's the only change, and keeping everything on one line as it currently is?
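
A rough sketch of what that one-line repr could look like, using toy stand-in classes (the `DataType`, `String`, `Array`, and `Struct` classes here are hypothetical and much simpler than ibis's real ones): `__str__` produces the DSL string and `__repr__` just wraps it in `dtype(...)`, with no line breaks at all.

```python
class DataType:
    """Toy stand-in for a dtype base class (not ibis's actual implementation)."""

    def __init__(self, nullable=True):
        self.nullable = nullable

    def _dsl(self):
        raise NotImplementedError

    def __str__(self):
        # Non-nullable types get a leading "!" in the DSL
        return ("" if self.nullable else "!") + self._dsl()

    def __repr__(self):
        # Single line, and pasteable back into a REPL
        return f"dtype({str(self)!r})"


class String(DataType):
    def _dsl(self):
        return "string"


class Array(DataType):
    def __init__(self, value_type, nullable=True):
        super().__init__(nullable)
        self.value_type = value_type

    def _dsl(self):
        return f"array<{self.value_type}>"


class Struct(DataType):
    def __init__(self, fields, nullable=True):
        super().__init__(nullable)
        self.fields = dict(fields)

    def _dsl(self):
        inner = ", ".join(f"{name}: {typ}" for name, typ in self.fields.items())
        return f"struct<{inner}>"


dt = Array(
    Struct({"line1": String(), "city": String()}, nullable=False),
    nullable=False,
)
print(repr(dt))
# dtype('!array<!struct<line1: string, city: string>>')
```

This keeps the two goals from the original post (readable, reproducible) without any decisions about line breaks or string length.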

Projects
Status: backlog
Development

No branches or pull requests

2 participants