-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use shell environment variables in migrations using Go's text/template parsing #439
base: main
Are you sure you want to change the base?
Conversation
Add support to selectively enable usage of env vars within migrations. Named env vars are enabled though "env-var:VAR_NAME" entries within the up/down block headers., reusing the existing parsing logic The name of activated env vars are then kept with migration options and passed to go templating just before executing relevant migrations.
Use "env" instead of "env-var" as per amacneil#352 (reply in thread)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see a few additional test cases that will codify the following behaviors and help set expectations:
- What does dbmate do if the migration specifies an environment variable (e.g.,
env:FOO
), and that environment variable is unset in the environment at the time dbmate is executed? What should the expected behavior/outcome be? Silently interpolate an empty string? Raise an error? Something else entirely? - What does dbmate do if a migration has a template string (e.g.,
{{ .FOO }}
) andenv:FOO
is not specified in the migration? What should the expected behavior/outcome be? Silently interpolate an empty string? Raise an error? Leave the template fragment as-is? Something else entirely? - What does dbmate do if there are characters in the environment variable that should be escaped, e.g.,
YABBA_DABBA_DOO="this ' is ' a string"
? Is the expectation that the user executing dbmate must ensure that all environment variable contents are properly escaped string values for the target database, and otherwise be susceptible to SQL injections and general errors due to improperly escaped strings? Is this a reasonable expectation? Will this behavior violate the principle of least surprise?
Also, please update the README's section on "Migration files" documenting this new functionality.
Good points: I'd like some more feedback before proceeding with changes. For the 1st and 2nd ones I propose to raise an error, i.e. we could simply add About SQL injections we could abandon templates in favor of query parameters/placeholders, passing env var values i.e. IMHO the downsides of placeholders are:
AFAIK named parameters are only available through additional packages like https://github.com/jmoiron/sqlx, so if the mandate is keeping thing simple, that's not a viable option. OTOH if we keep both templates and placeholders - leaving the migration author choose between risks of SQL injection and portability/positional declarations - we end up violating both the POLA and KISS principle. |
I agree that erroring on unset/unspecified is best. Perhaps we should think about placeholders more? Then we don't need our own variable feature but instead a feature for working with placeholders. |
I'm not sure what you mean. |
What's the actual downside of using sqlx? |
It seems placeholders may end up being too limited and templates will be needed. |
The migration system should be referencing static values in its template variables rather than user input, so SQL injection would have to be an insider attack that uses a static value that makes it through code review (or an environment variable and a controlled deployed environment)- quite different from what we normally think of for SQL injection. With respect to escaping, I think we can put the burden on the user that wants to use the template system. Over time we can provide some helper functions for different types of escaping. I think this PR is good once errors are raised for missing variables. |
Few things I like hearing more than talk of testing. I also like smaller packages. It's possible other maintainers don't want to introduce a new pattern and can speak up, but I am fine with it. |
Fail on referring non existent env vars Increase test coverage
Ouch, I forgot to update docs, but I'll do it soon |
Beef up docs Add test case for erroring on missing env var and js function to mitigate naive sql injection attempts
So I added some references to the I also played with naive sql injection attempts and saw they could be mitigated using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Some minor English language corrections requested, but otherwise 👍.
Fix typos and grammar errors
create role '{{ or (index . "THE_ROLE") "adminuser" }}' login password '{{ .THE_PASSWORD }}'; | ||
``` | ||
|
||
The `js` function may help to mitigate risks of SQL injection: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry if I missed the discussion- how does this mitigate SQL injections risks? Can't I insert a string with a single quote still?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The go/template functions includes js
which is described as:
js
Returns the escaped JavaScript equivalent of the textual
representation of its arguments.
While this may not necessarily be sufficient to prevent all possible SQL injection attacks (see: Unicode homoglyph injection attacks aka Unicode smuggling [1] [2], character set transcoding SQL injection attacks [3]), it prevents the most trivial kind where an unescaped single quote '
is in a string that will be interpolated into the SQL query, by escaping it with a backslash, thus turning '
into \'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trivial case of unescaped single quote is covered by TestResolveMitigatingSqlInjection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice test case!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment in the main PR thread - I'm not a fan of introducing functions to the template syntax, or encouraging use of js escaping functions for sql escaping.
Before merging this, I'm only very slightly concerned that we are technically introducing a backwards-incompatible change, in that if anyone actually has I suspect the likelihood of anyone having @amacneil, how have you handled introducing backwards-incompatible changes into releases? Does the dbmate project follow semantic versioning for its version numbering? If yes, then this probably means needing to bump the version to v3.0.0, and not v2.4.0, which seems a bit much considering the amount of functionality introduced by this change. Perhaps we should put migration file parsing behind a command line argument, default off, which would allow this change to be fully backwards-compatible? @davidecavestro & @gregwebs, how do you feel about that? I propose |
We could also enable templating just when at least one occurrence of OTOH if an additional switch is needed, IMHO a self-explanatory option is preferable, so |
That seems like a great way to do it- it ends up being enabled automatically when needed on a per-migration basis- so it would be fully backwards compatible. |
I'm indifferent to the two options. My concern is guaranteeing backwards-compatibility in all cases, which both options, (a) only enable template parsing when a migration defines an I would vote in favor of enabling Go template parsing with While I personally dislike migration scripts that are not plain text with literal values because it potentially introduces issues with repeatability (a future execution of the unmodified migration script can result in a different migration), I understand others do not try to abide by this constraint and a template-driven migration script might be really useful. Introducing Go template parsing in general opens the door for a wide range of functionality we could make available to migration script writers aside from only interpolating environment variables. We could even go one step further and remove the entire Users could then use Something to consider when deciding between enabling templating if and only if |
If you want reproducibility, the current system has it- it just needs to record the values of the variables used. We can do auto-enabling now and then still add the flag in the future if we support reading in template values that are not pre-declared. |
Yes, following semver. In general I try not to make backwards incompatible changes to the CLI/migration spec.
Agreed, better if we can do it in a backwards compatible way.
My initial reaction is that this goes against dbmate's philosophy of keeping things simple. Now we are introducing a whole templating language (and one that will be completely unfamiliar to non-Go developers). I've stated in the past that I think env vars in migrations are not a great idea, but if we are going to support it, then I think we should strictly limit this to env vars and not support any other functions.
Agreed, this seems sufficient to ensure backwards compatibility (existing migrations will not break, future migrations that use this feature will need to be aware of it.
I disagree with adding the Here is an alternative solution to address the problem of sql injection:
|
I do agree with this, and that would be a perfectly reasonable stance. However, the only real use case I have heard articulated for this feature seems to revolve around not storing passwords in your migration files and injecting them at runtime. In that case, it would be nice if your long secure passwords don't accidentally trip up migrations due to accidental sql injection (even if it will not be exploited). So I think doing my automatic string escaping proposal above would be worthwhile, as long as we give people an escape hatch. |
Another use case that I saw was to use a different schema name for a different environment. So rather than inserting a string, it would be inserting a database name. |
I have seen other migration systems use templating for some thing, like adding updated at and created at columns to a table with the trigger. There is a significant amount of boiler plate around attaching that to all the tables. However, it would probably be better to do such a thing in a pre-phase that still generates a static SQL output. |
We could expose a new command "dbmate template" for those that want to do ad hoc templating. It would be used as a pre phase- take a file as an input and output, a file. Then, for this environment, variable approach we could only support placeholders values such as strings. |
ok guys, it's plenty of viable options here, and it's certainly up to you maintainers driving the change with proper vision. I drafted this PR with the aim to keep things really simple, avoiding to tackle with SQL injection for many reasons. That said, I totally understand if you prefer to postpone and or/restart designing the whole thing from scratch to get better features. In other words I'm not in a hurry to have this feature: I contributed this PR just to give back and keep hands on golang. |
Thanks for the understanding. It is obvious we are lacking in a process to agree on how new features should behave before they start being developed. I guess one question that is being asked now is- what does the implementation look like to focus on the password insert case and use a placeholder for a string? Are there inherit difficulties or is there a straight-forward approach. As stated, I think even if we go that route we can still expose full templating for those that want it. |
The upside is that if we ensure backwards compatibility, and enable the templating functionality if and only if the user opts-in to using it by explicitly activating it, either through a flag in a migration script or a command line arg, whatever path we decide to take, then it becomes completely optional.
Personally, I agree with you, that anything that modifies the migration before it's executed isn't a great idea. In my opinion, one of dbmate's strengths is its forced simplicity, which makes its execution behavior easier to reason about and predict exactly what the outcome will be, and there's incredible value in that. It's this characteristic that drove my decision to use dbmate over alternate options.
Unfortunately, if we use Go's It's a shame that there's no easy way to disable the built-in functions, without forking the package and removing these lines of code: Oh, I just had an idea after looking at that code: maybe it's possible to define a custom function map with function names that shadow the built-in names, and make the implementation of the custom functions raise errors prevent them from being used. Not sure the juice is worth the squeeze, though.
If
Hm. This would require the change I just described above, making a database-specific quoting function part of the database
Maybe for now we only support the Should we proceed with changing |
I don't think there's anything inherently unsafe here- again this isn't dealing with arbitrary user input- it's variables values are known ahead of time, except for passwords. Randomly generated passwords aren't really a threat- this would have to be a user supplied password. That just needs to be given as a string. It might make more sense to use terms like escaped, quoted, raw, text, etc and make a note in the docs about passwords. |
In this context, "unsafe" communicates that you, the user, are wholly and entirely responsible for escaping the input string as necessary, that passing through values should be considered an unsafe operation if you would have otherwise expected your strings to be escaped and quoted for you. Essentially, "unsafe" is a warning signal to the operator that there are NO protections offered here: you are wielding a footgun. You are responsible for how you aim it. Do not blame anyone but yourself if the string you pass in causes bad things to happen because you failed to escape it properly. |
If we can't disable the Go template helpers, then how about we just do our own templating using E.g. this would work fine: -- migrate:up unsafe-env:ROLE env:PASSWORD
CREATE USER ${ROLE} WITH PASSWORD ${PASSWORD}; While we're here, it should probably be an error to run a migration that specifies an environment variable if that environment variable is unset (empty is ok, unset should be an error). |
I found a nuance / issue that I wanted to let you know about. In a db migration, if I want to use variables I follow this syntax....
Therefore I would assume that for the down the syntax would be the same
But I get an error...
BUT I did figure out that the following syntax works...
If you need any more details please let me know. I have the latest from your branch and I have a current version of GoLang [greg@CM-LT-0007 dbmate (env-vars)]$ git status go version go1.20.6 darwin/arm64 |
Will this PR be released soon? Thanks. |
@o-lee There's no expected release date for this PR. At a minimum, I think this needs to be implemented before this is safe to merge:
Or, alternatively, this:
And, someone needs to investigate @DevopsMercenary's issue and determine what's going on. We need someone to work on these. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on @DevopsMercenary's comment, it's clear that we need more test cases. Our current tests only cover migrate:down
with only one variable. At a minimum, we need a test with one, and a test with more than one.
@dossy, thanks for the update. |
Would there be appetite for me to pick this back up and address all of these comments? |
As I have no horse in this race—I have no need for this functionality, nor plan to use it—I am trying to limit my feedback to technical issues and concerns. Whether there's demand for this by others who are waiting to use it, they should speak up and represent themselves. |
I gave up a long time ago, so I'm very happy if you're willing to squeeze value out of it. |
I'm using this branch and would still really like to see this feature. It really helps with keeping the scripts DRY ( don't repeat yourself ) an important principal we prefer to follow. |
I don't have a personal need for this feature, but happy for someone to take it across the finish line (including tests etc). I stand by the syntax recommendation I made in this comment #439 (comment) -- migrate:up unsafe-env:ROLE env:PASSWORD
CREATE USER ${ROLE} WITH PASSWORD ${PASSWORD};
Edit: Potentially someone might want to use both the escaped and unescaped value in same migration? Seems unlikely, but that could be supported like this instead: -- migrate:up env:ROLE env:PASSWORD
CREATE USER ${unsafe:ROLE} WITH PASSWORD ${PASSWORD}; |
@amacneil coincidentally I started a branch last night where I replaced templating with |
Add support to selectively enable usage of env vars within migrations.
Named env vars are enabled though "env:VAR_NAME" entries within the up/down block headers., reusing the existing parsing logic
The name of activated env vars are then kept with migration options and passed to go templating just before executing relevant migrations.
See discussion #352 and issue #118