Create tutorial to get ZTF light curves from HATS catalog#325
Create tutorial to get ZTF light curves from HATS catalog#325jaladh-singhal wants to merge 7 commits into
Conversation
| right_on='oid', | ||
| how='inner' | ||
| ) | ||
| combined_df |
There was a problem hiding this comment.
Just FYI, I tried 3 more approaches before settling on this one (which takes takes 2m±5s both compute calls combined):
ztf_objects_cone.join(ztf_lc_cone, ...).compute()takes 2m±5sztf_objects_cone.merge(ztf_lc_cone, ...).compute()takes 2m±5sztf_objects_cone.compute(); ztf_lc_cone.id_search(values={'objectid': list(ztf_objects_cone_df.oid)}).compute()took 2m50s
I kept this approach because of time as well as to maintain the narrative of keeping objects table search optional.
|
I've pushed a commit directly that fixes the oldestdeps job failure, review is to come separately |
|
I'm not sure what goes on in circleCI, we may actually hit that memory limit even though the graph doesn;t show it (but the resolution of the graph is pretty bad, so it's still my prime suspect for the reason for the failure) -- keep an eye on the GHA buildhtml job instead for now. |
|
@bsipocz buildhtml job is getting skipped (I tried re-triggering it) |
troyraen
left a comment
There was a problem hiding this comment.
Thanks @jaladh-singhal! I'm requesting changes that are important but small. The meat of this is great. Thanks for putting it together so fast.
I ran this on Fornax and found that we can reduce memory usage to a max of <8G by using a dask client for the index search (in addition to the others that you already noticed). Hopefully that will be enough for it to run in CI. 🤞
I flagged two things that often confuse users (how objects are defined, and how the object ID column is named) and suggest we add admonitions for those.
There's two languaging things I think we should change. I commented on only the first instance of each of these, so look for other instances throughout the notebook.
-
I think "object" or "target" will be more clear to ZTF users than "source". "source" means different things to different people throughout astronomy, so you're usage isn't wrong per se, but in my experience with time-domain use cases and surveys like ZTF it almost always means a single observation (ie, one point in the light curve).
-
"Objects Table", "Lightcurves", and "HATS Collection" are proper nouns so use those spellings and capitalizations consistently. (In case it's confusing, "Lightcurves" is the name of the catalog, while a "light curve" is a time series of data points for a given object (and can also be plural). It's usually obvious which is meant, so that spelling should be used. But there are cases where a sentence/phrase is equally correct either way, so then you can just pick one.
...And then there's also the column name, which is spelled "lightcurve" 😅.)
|
|
||
| ```{code-cell} ipython3 | ||
| # Uncomment the next line to install dependencies if needed. | ||
| # !pip install s3fs "lsdb>=0.6.6,<0.8" pyarrow pandas astropy matplotlib |
There was a problem hiding this comment.
Is lsdb<0.8 just for our own CI and due to other notebooks? I wonder if/how we can handle that without implying to end users that this particular notebook requires lsdb<0.8.
There was a problem hiding this comment.
There was a problem hiding this comment.
FYI @bsipocz, @jaladh-singhal noticed (very) incorrect results returned by lsdb cone search and we tracked it to a version issue. Upgrading to lsdb>=0.8 fixes it. I double checked the HATS catalogs and the data are fine. Calculating separation using astropy shows that cone search returns objects that are well outside the cone. Not sure what's going on there. It's possible we're missing a crucial kwarg or something, but our upper pin of v0.7.3 is two major versions old now so better to spend our time trying to unpin. @jaladh-singhal will open a separate PR for that.
There was a problem hiding this comment.
Opened #331 - we need to wait for that to be merged before this one is merged.
|
Good news! Thanks to @troyraen's suggestions, the build-docs pipeline is passing now!! |
Fixes IRSA-7768