Rawsugar Recipes

  • make-public
  • Refs
  • Tasks
    • Get one step (upload manifest) working full cycle
      • Merge local/gs upload for the sake of step UI
      • {{DONE}} Link step to op
      • {{DONE}} Display link to uploaded sheet in step when complete
    • {{TODO}} Lacey idea -- each run-step should have a notes field
      • backend trivial, UI needs some design
  • Ontology
    • (getting into the class/prototype weeds)
    • 2.1 recipe
      • has multiple recipe-steps
    • 2.2 recipe-step
      • has predecessors, successors
    • 2.3 run
      • part-of batch
      • has-a recipe (? maybe not direct; could inherit from batch or datatype, but should stay consistent with run-step)
      • has time and agent info (so do steps, is that weird?)
    • 2.4 run-step
      • part-of a run
      • has-a recipe-step
      • has-a state (todo, in-progress, done etc)
      • has predecessors, successors
      • has time and agent info
    • 2.5 some informal step types
      • upload
      • match files
      • gate in cellengine
      • (note this conceivably has substeps; do we want to have this built in)
      • send files to RStudio (for clustering eg)
      • export to CANDEL
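
A minimal sketch of the ontology above, in Python rather than the real Datomic schema. The class and field names are illustrative assumptions, not the actual attributes; successors are derived from predecessor links instead of being stored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RecipeStep:
    name: str
    predecessors: List[str] = field(default_factory=list)  # names of earlier steps


@dataclass
class Recipe:
    name: str
    steps: List[RecipeStep]

    def successors(self, step_name: str) -> List[str]:
        # Successors are computed by inverting the predecessor links.
        return [s.name for s in self.steps if step_name in s.predecessors]


@dataclass
class RunStep:
    recipe_step: RecipeStep          # has-a recipe-step
    state: str = "todo"              # todo, in-progress, done, ...
    agent: Optional[str] = None      # agent info
    started: Optional[str] = None    # time info (ISO string, for simplicity)


@dataclass
class Run:
    batch: str                       # part-of batch
    recipe: Recipe                   # has-a recipe (possibly inherited in the real design)
    steps: List[RunStep]


def start_run(batch: str, recipe: Recipe) -> Run:
    """Instantiate one run-step per recipe-step, all in the todo state."""
    return Run(batch, recipe, [RunStep(s) for s in recipe.steps])


# Example built from a few of the informal step types listed above.
recipe = Recipe("example", [
    RecipeStep("upload"),
    RecipeStep("match files", ["upload"]),
    RecipeStep("gate in cellengine", ["match files"]),
])
run = start_run("batch-1", recipe)
```

Here `run.steps` mirrors the recipe one-to-one; whether runs and run-steps both need time and agent info is left open, as in the notes.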
  • UI
  • Questions (for Lacey mostly)
    • Defining recipes per-batch, but do we ever want steps that combine data types or otherwise involve multiple batches?
      • Lacey: no current cases, might be in the future
    • Are recipes 1-1 with datatypes (if so don't need to have both fields on batch)
      • NO, there might be different recipes
    • evolution of recipes over time (think)
    • Since each step can generate its own files, we should be associating files with steps rather than batches (or with both…unclear). Mentioned in comment in recipe doc.
  • Questions (for me mostly)
    • 5.1 Steps and ops are not that different, might consider sharing some infrastructure.
    • 5.2 Use a prototype approach?
      • It's not like Datomic has a real class system anyway (although that is what Alzabo provides, sort of)
      • Explicit Alzabo support? Not sure what that would look like or if it's necessary.
      • Hm, this could be a general EDN thing…and might already exist. What about that thing Rob uses for configs? Aero? Oh hey looks like it might work with merge and ref tags…
      • Wait a minute, a purely syntactic inheritance is not going to work for this…
      • Practicalities: instead of separate types for recipe/run and recipe-step/run-step, they are mooshed together.
      • Extra attributes? :entity/prototype and maybe :entity/prototype?
    • 5.2.1 A sudden obvious insight
      • The recipe-level stuff doesn't even have to be in Datomic; it's defined by EDN and can stay that way.
      • Still useful to have prototype-like inheritance in EDN definition file. So Aero might be useful after all.
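
A rough sketch of what prototype-like inheritance in a definition file could look like, with a Python dict standing in for the EDN. The `"prototype"` key and the `resolve` helper are hypothetical; this does not show how Aero's tags behave.

```python
def resolve(defs, name, _seen=()):
    """Return the definition for `name` with inherited attributes filled in.

    Attributes on the entry itself win over those from its prototype chain.
    """
    if name in _seen:
        raise ValueError(f"prototype cycle at {name!r}")
    entry = defs[name]
    proto = entry.get("prototype")
    base = resolve(defs, proto, _seen + (name,)) if proto else {}
    own = {k: v for k, v in entry.items() if k != "prototype"}
    return {**base, **own}


# Hypothetical step definitions; names are for illustration only.
defs = {
    "step": {"state": "todo", "notes": ""},
    "upload": {"prototype": "step", "title": "Upload manifest"},
    "upload-gs": {"prototype": "upload", "source": "gs"},
}

resolved = resolve(defs, "upload-gs")
```

This only covers definition-time inheritance in the file; it does nothing for live run-steps looking up values from their recipe-steps, which is the part purely syntactic inheritance can't handle.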
    • 5.3 Acceptance / types
      • Means we need to bring back column labeling. Probably a good thing
    • 5.4 Task State and multiple tries
    • 5.4.1 Basic: :todo, :done, :in-progress, :blocked(waiting)
      • with obvious allowed transitions
    • 5.4.2 UI state transitions
      • normal op: todo → done
      • matching (eg): todo → in-progress → done (when acceptance test passes)
      • OK then the failed condition should be explicit: "all files must have metadata" or whatever.
      • Should have manual override (eg declare a step done no matter what)?
    • 5.4.3 Going from :done to :todo
      • Doover, reset affordance
      • should make a new parallel run-step instance
      • (I guess, still not sure how those are going to work)
      • alternatively – keep things simple, that just clears out the run state and artifacts, you can use Datomic history to get old ones if you really must
    • 5.4.4 Oh fuck client/server
      • I guess this means that run should be:
        • compact
        • serialized along with batch
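
A sketch of the task-state machinery from 5.4. The notes only say the allowed transitions are "obvious", so the `ALLOWED` table below is an assumption, as are the function names. Steps are plain dicts to keep them compact and easy to serialize along with a batch.

```python
# Assumed transition table; the notes do not spell it out.
ALLOWED = {
    "todo": {"in-progress", "done", "blocked"},
    "in-progress": {"done", "blocked", "todo"},
    "blocked": {"todo", "in-progress"},
    "done": {"todo"},  # the reset / do-over affordance
}


class TransitionError(Exception):
    pass


def transition(step, new_state, acceptance=None, override=False):
    """Return a copy of `step` in `new_state`, or raise TransitionError.

    `acceptance` is an optional function returning an explicit failure
    message (or None on success). `override` declares the step done no
    matter what.
    """
    if override and new_state == "done":
        return {**step, "state": "done"}
    if new_state not in ALLOWED[step["state"]]:
        raise TransitionError(f"{step['state']} -> {new_state} is not allowed")
    if new_state == "done" and acceptance is not None:
        failure = acceptance(step)
        if failure:
            raise TransitionError(failure)
    return {**step, "state": new_state}


def all_files_have_metadata(step):
    """Example acceptance test with an explicit failure condition."""
    missing = [f["name"] for f in step.get("files", []) if not f.get("metadata")]
    if missing:
        return "all files must have metadata; missing for: " + ", ".join(missing)
    return None
```

A matching step would go todo → in-progress, then to done only once `all_files_have_metadata` returns None or someone passes `override=True`.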
  • 7 Design Review
    • 7.1 Problem
      • We have a lot of data types and each has its own complicated analytical procedure.
      • From design at beginning of the year:
      • ⪢ RawSugar 2.0 should be a system that not only stores data but also tracks provenance and relationships as data is processed through various workflows on external systems. It should serve as a unifying “one stop shop” for viewing, finding, and accessing data in various stages of processing.
      • Helps with
        • reproducibility
        • documentation / knowledge capture (in case any more data scientists leave!)
        • work tracking
    • 7.2 Lacey Recipes
    • 7.3 UI
    • 7.4 Underlying schema
    • 7.5 Machinery / Issues
      • projects, sheets, files
      • projects, batches, sheets, files
      • projects, batches, run-steps, sheets, files
  • 8 Multiple runs
    • From Federico's comments, realizing this is indeed an important feature and not captured in the current prototype
    • How should this work?
      • doesn't quite make sense to have run objects because we want to be able to fork at arbitrary places
      • but it would be easier for users…
      • {{DONE}} run-steps will need to have an explicit predecessor link (or equivalent); can't rely on inferring from recipes
      • so there will be multiple instances of a recipe step type for a given batch
      • need some way to pick out "current" set of steps
      • going to be hard to visualize (in part because we already have branching; can't use a tree)
      • ui: show stacked boxes? And a selector? Eh.
      • default display to show the latest? But you might have disjoint sets…ugh.
    • Use case:
      • user doesn't like some result
      • finds the step responsible, clicks on some "Redo" affordance (? better name)
      • system makes new run-steps for that and all downstream steps.
      • UI redisplays to reflect above (not sure how the old steps should appear, but can whiff for now)
    • Guess that's it….
    • Alternatively, reuse the existing run-step objects and just use Datomic history…that means less plumbing to deal with multiple run-steps
    • Maybe have a current? attribute that only applies to one graph? Then query can work as normal. Ugh.
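
A sketch of the "Redo" use case under the first design (new run-step instances plus a current flag). The dict layout, the `"current"` key, and the function names are assumptions for illustration; real run-steps would be Datomic entities.

```python
def downstream(steps, start_id):
    """Ids of `start_id` plus every current step reachable via predecessor links."""
    found = {start_id}
    grew = True
    while grew:
        grew = False
        for sid, step in steps.items():
            if step["current"] and sid not in found and found.intersection(step["preds"]):
                found.add(sid)
                grew = True
    return found


def redo(steps, start_id):
    """Create fresh run-steps for `start_id` and everything downstream.

    Old steps are kept but marked not current. Returns {old_id: new_id}.
    """
    old_ids = sorted(downstream(steps, start_id))
    next_id = max(steps) + 1
    replacement = {old: next_id + i for i, old in enumerate(old_ids)}
    for old, new in replacement.items():
        src = steps[old]
        steps[new] = {
            "kind": src["kind"],
            # Rewire links to the new copies; upstream links stay untouched.
            "preds": [replacement.get(p, p) for p in src["preds"]],
            "state": "todo",
            "current": True,
        }
        src["current"] = False
    return replacement


def current_ids(steps):
    return [sid for sid, s in steps.items() if s["current"]]


# A small branching graph: steps 3 and 4 both follow step 2.
steps = {
    1: {"kind": "upload", "preds": [], "state": "done", "current": True},
    2: {"kind": "match files", "preds": [1], "state": "done", "current": True},
    3: {"kind": "gate in cellengine", "preds": [2], "state": "done", "current": True},
    4: {"kind": "send files to RStudio", "preds": [2], "state": "done", "current": True},
}
mapping = redo(steps, 2)
```

After the redo, a query for current steps sees only the upload step and the three fresh copies; the retired steps stay around for whatever UI eventually shows them.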
  • {{DONE}} Rearchitecting object system
    • Have a local cache of entities, indexed by eid
    • Need to augment with type probably (and parent links)
    • Means page urls can be simpler
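
A sketch of the local entity cache idea, assuming entities are plain dicts. The class name, method names, and URL scheme are hypothetical; the point is that with type and parent stored on each cached entity, a page URL needs only the eid.

```python
class EntityCache:
    """Local cache of entities, indexed by eid, with type and parent links."""

    def __init__(self):
        self._by_eid = {}

    def add(self, eid, etype, parent=None, **attrs):
        self._by_eid[eid] = {"eid": eid, "type": etype, "parent": parent, **attrs}

    def get(self, eid):
        return self._by_eid[eid]

    def ancestors(self, eid):
        """Eids of the parent chain, nearest parent first."""
        chain = []
        parent = self._by_eid[eid]["parent"]
        while parent is not None:
            chain.append(parent)
            parent = self._by_eid[parent]["parent"]
        return chain

    def url(self, eid):
        # Hypothetical URL scheme: the containing project and batch can be
        # recovered from the cache, so they need not appear in the path.
        return f"/{self.get(eid)['type']}/{eid}"


cache = EntityCache()
cache.add(10, "project", name="p1")
cache.add(20, "batch", parent=10)
cache.add(30, "sheet", parent=20)
```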