I see that the default is set to 0.8 but is there a general rule for what values the resolution should fall between? I know lowering the resolution will mean decreasing the total number of clusters but is there a way to determine if one resolution is better than another?
This is kind of a fuzzy topic in data analysis in general, so I’ll try to summarise what I know about this, and add my opinion at the bottom.
There are roughly two strategies:
Use domain knowledge to determine which number of clusters is correct for that particular dataset and the specific questions to be posed. The number of clusters will then determine a resolution value.
Use a data-driven approach (not currently possible in Cellenics). There are several metrics that evaluate clustering “quality”, such as the Silhouette coefficient, cluster purity, and modularity, and you can use those to optimize the clustering parameters. This approach leads to reasonable clustering, but not necessarily to biological insights.
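To make the second strategy a bit more concrete, here is a minimal sketch of scoring a few candidate clusterings with the Silhouette coefficient, done outside of Cellenics. Everything in it is an assumption for illustration: the 2D points are toy data standing in for a low-dimensional embedding, the three label sets stand in for what a clustering run might return at a low, medium and high resolution, and `silhouette_score` is a small hand-written helper, not a Cellenics or Seurat function.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean Silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Toy data (made up): three well-separated blobs of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),        # blob A
    (10, 0), (10, 1), (11, 0), (11, 1),    # blob B
    (0, 10), (0, 11), (1, 10), (1, 11),    # blob C
]

# Hypothetical cluster labels, as if produced at three different resolutions.
candidates = {
    "low": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],     # A and B merged
    "medium": [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],  # one cluster per blob
    "high": [0, 0, 3, 3, 1, 1, 1, 1, 2, 2, 2, 2],    # A split in two
}

scores = {name: silhouette_score(points, labels) for name, labels in candidates.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("highest Silhouette:", best)
```

On this toy data the clustering with one cluster per blob scores highest, because both merging and over-splitting blobs pull the score down. On real data the highest-scoring setting is only the most compact and best-separated one by this metric, which is not necessarily the most biologically meaningful one.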
Empirically, values between 0.4 and 1.2 (according to the Seurat tutorials) return reasonable results for smallish (and not so small) datasets, and for bigger datasets it makes sense to increase the resolution value so that less common cell types can be discerned. This is what informed our default value.
In my opinion, the best you can do is to change the resolution value and look at your data at various scales, informed by the biology behind your experiment. Cellenics lends itself really well to this kind of quick iteration. For example, you could increase the resolution, which separates some blobs into several clusters, and then look at the Heatmap in Data Exploration to see whether the marker genes for the new clusters make sense or whether they are really similar to those of the initial clustering.
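If you export the marker gene lists, one rough way to put a number on "really similar" is the overlap between the top markers of the new clusters. The sketch below is only an illustration under assumptions: the cluster names and gene names are placeholders, the lists are made up, and the 0.5 cut-off is arbitrary. It is not something Cellenics computes for you.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene lists: shared genes / all genes in either list."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Placeholder top-marker lists for clusters obtained after raising the resolution.
markers = {
    "cluster_3a": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "cluster_3b": ["GENE_A", "GENE_B", "GENE_C", "GENE_E"],
    "cluster_5": ["GENE_X", "GENE_Y", "GENE_Z", "GENE_W"],
}

# Pairwise overlap between every pair of clusters.
overlaps = {
    (first, second): jaccard(markers[first], markers[second])
    for first, second in combinations(markers, 2)
}

THRESHOLD = 0.5  # arbitrary cut-off, chosen only for this example
similar_pairs = [pair for pair, value in overlaps.items() if value >= THRESHOLD]

for pair, value in overlaps.items():
    print(pair, round(value, 2))
print("pairs with heavily overlapping markers:", similar_pairs)
```

A pair of new clusters with heavily overlapping markers is a hint that the higher resolution may have split one population without adding much, but the biology should still have the final say.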