SenNet + HOA - Hacking the Human Vasculature in 3D
Segment vasculature in 3D scans of human kidney
SenNet + HOA - Hacking the Human Vasculature in 3D
[lb0.870 !!!] experiment results, hopefully open gold solution till 21-jan-2024
more to come later โโโฆ. ใ คใ คใ คใ คใ คใ คโโโโ โโโโ โโโ โโโโ ใ ค ใ คใ คใ คใ คใ คโโโโ โโโโ โโโ โโโโ โโ
๐ฆ nov-21:
- a simple baseline notebook to check reading of images and model inferenceใ
คใ
คใ
คโโโโ โโโโ โโโ โโโโ ใ
ค
 https://www.kaggle.com/code/hengck23/lb0-534-baseline-simple-unet-seresnext26d-32x4
๐ฆ nov-28:
- advanced baseline with more augmentation and 3d post-processing ใ
คใ
คใ
คโโโโ โโโโ โโโ โโโโ ใ
ค
 https://www.kaggle.com/code/hengck23/lb0-808-resnet50-cc3d2d-unet-xy-zy-zx-cc3d
- this concluded my one-week feasibility study:- estimated upperbound: lb 0.93
- identified key problems: miss (broken large vessels, missing small vessels), fp (noise, sometimes out of kidney volume)
- solutions: multiscale and 3d model (i think 3d transformer/hybrid/cnn should work well, need long range attention)
 
i am obviously? overfitting the public LB data โฆ
here is a visualization of the target vessel we are segmenting:
 
obviously, 3d segmentation will get better results. you probably can't just rely on deep net, and need some very smart 3d image post processing to win.
107 Comments
hengck23
Posted 14 days ago
ยท 3rd in this Competition
wrong strategy?
i have been trying to detect the small vessels to improve my LB score becuase it is stated in the paper that this is one of the problem.
But then i realize that this may not be a good strategy?
i use train= kidney1 dense + kidney3 dense
validation= kidney2 sparse (from slice >= 900 to avoid annotation shift error)
i first note the recent popular public kernel based on seresnext50_32x4d 2dunet which can easily object lb 0.859. seresnext50_32x4d has imagenet top1 in range of ~0.79. My experiments shows that while seresnext50_32x4d 2dunet has low hit rate (>0.83), it has very low fp rate (~0.111) as well. i am surprise that it can get 0.864 lb single model.
i have been using nextVIT base transformer, imagenet top1 ~0.82, to get lb 0.863. It has very high hitrate (>0.93) and low fp rate (~0.183).
it turns out that since lb is based on surface pixels, small vessel has less surface (and contribute little to lb score) !!!
e.g. for kidney3 sparse (85% annotation) has surface dice 0.98 compared to kidney3 dense
in other words, it is wrong choice of metric if the host interest is to detect small vessel
hengck23
Posted 14 days ago
ยท 3rd in this Competition
update on 06-JAN
trick to get lb 0.87+
- make a base model in the range of 0.864 (e.g. either seresnext50_32x4d or nextVIT base 2d unet) 
- ensemble of the two will give 0.87+ 
- ensemble improve lb score: 
- detect more instance
- for an already detected instance improve its boudary to improve the dice score for that instance
- reduce fp
Posted 13 days ago
ยท 372nd in this Competition
@hengck23 How do you compute your FPR ? Your definition probably differs from the one I'm used to regarding normalization.
Do you use (pred * (1 - truth)).sum() / truth.sum() ?
Same for TPR, you are using (pred * truth).sum() / truth.sum(), right ?
Thanks!
hengck23
Posted 13 days ago
ยท 3rd in this Competition
def np_metric(predict, truth):
    p = (predict>0.5)
    t = (truth>0.5)
    hit = (p*t).sum()
    fp  = (p*(1-t)).sum()
    t_sum = t.sum()
    p_sum = p.sum()
    return hit, fp, t_sum, p_sum
#-------------------------------------
predict = prob>0.5
hit, fp, t_sum, p_sum = np_metric(predict, truth)
hit = hit/t_sum
fp = fp/p_sum
precision and recall are better words
hengck23
Posted 18 days ago
ยท 3rd in this Competition
i realize something important.
i checked the fast surface dice loss and i think we can directly optimize this loss in gradient descent!
i) use 3 channel input and predict 3 channel output
ii) refer to the fast surface dice code by @junkoda . you can apply same marching cube algorithm (2x2x2) to compute soft surface dice loss
instead of unfold 3d conv with 256 kernels may be slightly faster
this may be important because i realize the reason for missing small vessel is annotation error (even for the 100% dense label, not all smallest vessel are annotated). this causes the following
- surface dice metric is fluctuating up and down during the training iteration
- 3d, 2.5d results are poorer than pure 2d
- confidence for smallest vessel are the worst (because of inconsistent label). At first i thought is this because of the small area (since deep learning minimize BCE loss for largest area first), but then i think inconsistent label has a larger effect.
Posted 18 days ago
ยท 372nd in this Competition
From my understanding the surface dice metric requires thresholding of predictions hence we cannot directly optimize it.
Applying the metric on probabilities does not seem to yield meaningful results but I did not dig very deeply nor look into the math.
hengck23
Posted 18 days ago
ยท 3rd in this Competition
here is the trick:
(it works because we are using tolerance =0)
if tolerance is not zero, then we have to modify the softmax loss to KL divergence loss where the ground truth probabiliy = 1 - surface distance
hengck23
Posted 18 days ago
ยท 3rd in this Competition
did we just find the missing labels (for sparsely annotated ground truth labels)?
Posted 18 days ago
ยท 29th in this Competition
GJ. I believe both sparse and dense have labeling issues (imho, cuz I`m not a doctor). Looked through and found even pretty big vessels missed. Thx for sharing.
hengck23
Posted 17 days ago
ยท 3rd in this Competition
yes you are correct.
an easy way to prove label error is to check train error after overfitting.
- slowly increase the network parameter (e.g. resnet18,34,50,101 โฆ)
- train with small datsaet until overfit
now if the train error cannot get to zero, waht can be the reason?
- not enough parameter (can be verify by increasing parameter)
- wrong label (can be verify by visualisation, i.e. draw the fp and miss)
hengck23
Posted a month ago
ยท 3rd in this Competition
20-dec:
recepie for LB 0.848
- use transformer (e.g. nextVIT base)
- use train = kidney1/dense + kidney3/dense
- use validate = kidney2
- just normal unet2d (with a additional encoder layer for H,W and H/2,w/2)
- infer with the usual xy,zx,zy + 5 TTA
details at the other posts below โฆ.
i am surprised that it can get LB 0.862 by training for long iterations (but CV does not reflect the change โฆ maybe because of wrong annotation on kidney2 and the segmentation sparsity (i.e. incomplete label)
i.e. there is probably no reliable local validation set in this competition
Posted 24 days ago
ยท 553rd in this Competition
Hi, I don't understand what does xy,zx,zy mean๏ผcould you please explain it, thanks!
Posted a month ago
ยท 3rd in this Competition
How long does it take for you to inference with Transformer, it seems that P100 without fp16 inference is very slow for Transformer, do you have any suggestions for this?
hengck23
Posted a month ago
ยท 3rd in this Competition
you can google for fast transformer. i use nextvit, which was used in kaggle rnsa breast mamnographs before. it takes 3to4 hr and can be speedup further by tensort rt to 3 hr
hengck23
Posted a month ago
ยท 3rd in this Competition
update: nextVIT-base can get LB0.682 3hr 20 min (just complete in few minutes along so i can get the timing in minutes).
it uses 5xTTA + 3 axis (xy,yz,xz) in P100
hengck23
Posted a month ago
ยท 3rd in this Competition
correct way to do validation:
- just validate on kidney3 is not enough.
- another test (just test only, don't use it to true hyper-parameters) on kidney2 is probably more accurate
- low fp seems to be the key for good LB score. for me fp=0.05 is optimal for LB.
- transformer seems to be a "much" better model (maybe more rubust against salt noise,etc see segformer paper)
NOTE:
- all models below have local CV of 0.90 for kidney3 (dense). But when it comes to kidney2, results are different, as hsown in the table below.
- all models are trained on kidney1 (dense) only
- top upper half of kidney2 has annotation error (shift) โฆ see anotehr post below
hengck23
Posted a month ago
ยท 3rd in this Competition
another tip:
my previous two winning kaggle solution has huge shake up. I could win because i kept to the follwing principles. instead of the best parameter (e.g. threshold) for the solution:
- try to improve your solution so that it is not sensitive to thresholds, i.e. performance is optimal over a range of threshold. e.g. best AP verus MAP
- try to improve your solution so that it is not sensitive to data, i.e. there is low variance in accuracy for different validation samples
in summary, in addition to best results, think of ways to
- measure variance
- reduce variance
hengck23
Posted a month ago
ยท 3rd in this Competition
just an idea:
takes 2 nearby slices and generate vessels in between, both image and mask
hengck23
Posted a month ago
ยท 3rd in this Competition
https://www.kaggle.com/code/junkoda/fast-surface-dice-computation
based on the fast surface dice computation by @junkoda, i did an extensive study on the effect of threshold and the local lb metric. For each model after the training epoch, i made measurements.
conclusion:
- TTA and TTA+xy,zx,zy always improve surface-dice, even if the model is under or over-fitted.
- there are two optimum:- early stopping (high threshold >0.5)
- just before over fitted (0.2 to 0.3 threshold)
 
- BCE loss seems not suitable. both train and validation loss are decreasing steadily. But surface-dice are moving up and down. One may want to take a look at e.g. edge loss, Hausdorff Distance loss, etc
Posted a month ago
ยท 252nd in this Competition
Hello, may I ask how you do multi-view training during the training process, is it cycling three axes training in one epoch? I put the data of the three views into a dataset, set the batchsize to 1, and shuffle it, but it doesn't seem to work well. Could you please make a reply? Thank you very much
hengck23
Posted a month ago
ยท 3rd in this Competition
def make_train_id(
    meta_data=DATA_META,
    name = ['kidney_1_dense',],
):
    train_id=[]
    for n in name:
        d = meta_data[n]
        D,H,W = d.image.shape
        train_id += [ (d.name, i, 'y') for i in range(H) ]
        train_id += [ (d.name, i, 'z') for i in range(D) ]
        train_id += [ (d.name, i, 'x') for i in range(W) ]
    return train_id
train_id = make_train_id(...)
class HiPDataset(Dataset):
    def __init__(self, sample_id=train_id, augment=None):
        self.sample_id = sample_id
        self.augment = augment
        self.length = len(self.sample_id)
        unique_name=[]
        for name,i,axis in sample_id:
            if not name in unique_name:
                unique_name.append(name)
        self.unique_name=sorted(unique_name)
    def __str__(self):
        string = ''
        string += f'\tlen = {len(self)}\n'
        string += f'\tunique_name = {self.unique_name}\n'
        return string
    def __len__(self):
        return self.length
    def __getitem__(self, index):
        name,i,axis = self.sample_id[index]
        d = DATA_META[name]
        D,H,W = d.image.shape
        if axis == 'z':
            image = d.image[i]
            vessel = d.vessel[i]
        if axis == 'y':
            image = d.image[:, i]
            vessel = d.vessel[:, i]
        if axis == 'x':
            image = d.image[:, :, i]
            vessel = d.vessel[:, :, i]
        image  = np.ascontiguousarray(image)
        vessel = np.ascontiguousarray(vessel)
        if self.augment is not None:
            image, vessel = self.augment(image, vessel)
        image  = np.ascontiguousarray(image)
        vessel = np.ascontiguousarray(vessel)
        #---
        r = {}
        r['index'] = index
        r['sample_id' ] = (name,i,axis)
        r['image' ] = torch.from_numpy(image).float()
        r['vessel'] = torch.from_numpy(vessel).float()
        return r
hengck23
Posted a month ago
ยท 3rd in this Competition
trick to get lb 0.835
hengck23
Posted a month ago
ยท 3rd in this Competition
    def forward(self, batch):
        x = batch['image']
        x = x.expand(-1,3,-1,-1) 
        B, C, H, W = x.shape
        encode = []
        xx = self.stem0(x); encode.append(xx)
        xx = F.avg_pool2d(xx,kernel_size=2,stride=2)
        xx = self.stem1(xx); encode.append(xx)
        e = self.encoder #convnext
        x = e.stem(x);
        x = e.stages[0](x); encode.append(x)
        x = e.stages[1](x); encode.append(x)
        x = e.stages[2](x); encode.append(x)
        x = e.stages[3](x); encode.append(x)
        ##[print(f'encode_{i}', e.shape) for i,e in enumerate(encode)]
        last, decode = self.decoder(
            feature=encode[-1], skip=encode[:-1][::-1]
        )
        ##[print(f'decode_{i}', e.shape) for i,e in enumerate(decode)]
        ##print('last', last.shape)
        vessel = self.vessel(last)
hengck23
Posted a month ago
ยท 3rd in this Competition
example of simple 3d flood fill
hengck23
Posted a month ago
ยท 3rd in this Competition
for iteration in range(100):
 # repeat grow up, grow down,   grow up, grow down, ....
 for t in range(750,0,-1): 
    print(t)
    prev_mask  = dense[t+1]
    prev_image = image[t+1]
    curr_image = image[t]
    m = curr_image[1:-1,1:-1]
    prev = prev_mask*prev_image
    th=2
    diff = 0
    diff += np.abs(m-prev[1:-1,1:-1])<th #[ 0, 0]
    diff += np.abs(m-prev[2:  ,1:-1])<th #[ 1, 0]
    diff += np.abs(m-prev[0:-2,1:-1])<th #[-1, 0]
    diff += np.abs(m-prev[1:-1,0:-2])<th #[ 0,-1]
    diff += np.abs(m-prev[2:  ,0:-2])<th #[ 1,-1]
    diff += np.abs(m-prev[0:-2,2:  ])<th #[-1,-1]
    diff += np.abs(m-prev[1:-1,2:  ])<th #[ 0, 1]
    diff += np.abs(m-prev[2:  ,2:  ])<th #[ 1, 1]
    diff += np.abs(m-prev[0:-2,2:  ])<th #[-1, 1]
    grow = diff>1
    #todo: grow across plane here ...
    predict += grow
    #image_show_norm('predict',predict)
    image_show_norm('predict',(predict>0).astype(np.float32))
    cv2.waitKey(0)
hengck23
Posted a month ago
ยท 3rd in this Competition
lb0.829 โฆ
normalisation is the key
def norm_by_percentile(volume, low=10, high=99.8, alpha=0.01):
    xmin = np.percentile(volume,low)
    xmax = np.percentile(volume,high)
    x = (volume-xmin)/(xmax-xmin)
    if 1:
        x[x>1]=(x[x>1]-1)*alpha +1
        x[x<0]=(x[x<0])*alpha
    #x = np.clip(x,0,1)
    return x
normalised by volume/subvolume, not image
Posted a month ago
ยท 95th in this Competition
@hengck23 - is this the right way to read 3D volume because i get negative stride error
class BuildDataset(torch.utils.data.Dataset):
def __init__(self, img_voxel_paths, msk_voxel_paths=[], transforms=None):
    self.img_voxel = create_3d_voxel(img_voxel_paths)  # Load the 3D voxel for images
    self.msk_voxel = create_3d_voxel_msk(msk_voxel_paths) if msk_voxel_paths else None  # Load the 3D voxel for masks
    self.transforms = transforms
def __len__(self):
    return self.img_voxel.shape[0]  # Number of slices in the voxel
def __getitem__(self, index):
    img = self.img_voxel[index, :, :].copy()  # Extract the 2D slice from the image voxel
    img = np.expand_dims(img, axis=0)  # Add channel dimension
    if self.msk_voxel is not None:
        msk = self.msk_voxel[index, :, :].copy()  # Extract the 2D slice from the mask voxel
        msk = np.expand_dims(msk, axis=0)  # Add channel dimension
        if self.transforms:
            data = self.transforms(image=img, mask=msk)
            img = data['image']
            msk = data['mask']
        return torch.tensor(img, dtype=torch.float32), torch.tensor(msk, dtype=torch.float32)
    else:
        if self.transforms:
            data = self.transforms(image=img)
            img = data['image']
        return torch.tensor(img, dtype=torch.float32)
hengck23
Posted a month ago
ยท 3rd in this Competition
def do_random_flip_rotate(image, vessel):
    if np.random.rand()<0.5:
        image  = np.flip(image, axis=1) #horizontal
        vessel = np.flip(vessel,axis=1)
    if np.random.rand()<0.5:
        image  = np.flip(image, axis=0)
        vessel = np.flip(vessel,axis=0)
    if np.random.rand()<0.5:
        k = np.random.choice([1,2,3])
        image  = np.rot90(image, k, axes=[0,1])
        vessel = np.rot90(vessel,k, axes=[0,1])
    image = np.ascontiguousarray(image)
    vessel = np.ascontiguousarray(vessel)
    return image, vessel
hengck23
Posted a month ago
ยท 3rd in this Competition
high resolution 2d/3d unet solution !!!
demo notebook :https://www.kaggle.com/code/hengck23/2d-to-3d-unet-demo
hengck23
Posted a month ago
ยท 3rd in this Competition
how to properly design 3d solution.
if you cannot fit a single 3d net work (because of memory GPU constraint), 
you properly need at least 2 net (of different hierarchy scale).
Posted a month ago
ยท 720th in this Competition
posting it here for visibility Patch based 3D UNET + TF + TPU: https://www.kaggle.com/code/dhinkris/training-patch-based-3d-unet-tf-tpu
hengck23
Posted a month ago
ยท 3rd in this Competition
you should try these pretrain 3d unet
https://github.com/Tencent/MedicalNet
Posted a month ago
ยท 95th in this Competition
@hengck23 how do you stack the 2Dslice to 3D in kaggle inference , i get memory overflow error
hengck23
Posted a month ago
ยท 3rd in this Competition
i don't train in kaggle. i use my local machine
hengck23
Posted a month ago
ยท 3rd in this Competition
stack as uint16 or uint8.
stack only one at a time (not both kidney 6,5)
Posted a month ago
ยท 720th in this Competition
take a look at this thread: https://www.kaggle.com/competitions/blood-vessel-segmentation/discussion/453986
hengck23
Posted a month ago
ยท 3rd in this Competition
this is why the vessel are detected as broken
Posted 12 days ago
ยท 384th in this Competition
@hengck23 The vessels only look like a hole when viewed by slices, I suppose we actually won't see the faded/broken vessels in the slice (your image on the left), is that right? And the stuff inside, did you find that's for broken vessels only or it's generic?
hengck23
Posted a month ago
ยท 3rd in this Competition
[1] Memory transformers for full context and high-resolution 3D Medical Segmentation
https://arxiv.org/pdf/2210.05313.pdf
"Combined, they allow full attention over high resolution images, e.g. 512 x 512 x 256 voxels and above. Experiments on the BCV image segmentation dataset shows better performances than state-of-the-art CNN and transformer baselines, โฆ."
hengck23
Posted 2 months ago
ยท 3rd in this Competition
IMPORTANT!!!!!
This is the black magic and the trick to winning!!!!
results of 3d flood fill with seed = one pixel
hengck23
Posted 2 months ago
ยท 3rd in this Competition
this reminds me of 3d SAM (segment anything model)
if so, we can use self-supervised learning to train the encoder 
the seed will be prompt.
the decoder is to perform floodfill segmentation
hengck23
Posted 2 months ago
ยท 3rd in this Competition
volumetric rendering of image volume
it shows the uniform intensity of walls of the vessel. that is why flood fill works well
hengck23
Posted 2 months ago
ยท 3rd in this Competition
kidney mask dataset
https://www.kaggle.com/datasets/hengck23/blood-vessel-segmentation-kidney-mask
 
Posted 2 months ago
ยท 433rd in this Competition
Thanks for putting this together, I was just looking at doing something like this yesterday before getting pulled in to holiday shenanigans. I re-scaled the data back to the full image size, RLE encoded it, and saved it back to the training csv for anyone interested in using it without resizing on load: https://www.kaggle.com/datasets/squidinator/sennet-hoa-kidney-13-dense-full-kidney-masks
hengck23
Posted a month ago
ยท 3rd in this Competition
why you need self-supervised learning
_50um_LADAF-2020-31_kidney : 1644x1108,2141
hengck23
Posted 2 months ago
ยท 3rd in this Competition
wrong annotation !!!!!
hengck23
Posted 2 months ago
ยท 3rd in this Competition
i think there is offset bug in annotion for kidney2
hengck23
Posted a month ago
ยท 3rd in this Competition
shift error happens in hidden test datat ground truth before in kaggle competitionsโฆ
anyone want to probe that?
hengck23
Posted a month ago
ยท 3rd in this Competition
only top half of kidney 2 annotation are shifted
TOP:
BOTTOM:
    pl = pv.Plotter()
    mhit = pv.PolyData(np.stack(np.where(hit > 0.1)).T).glyph(geom=pv.Cube())
    mfp = pv.PolyData(np.stack(np.where(fp > 0.1)).T).glyph(geom=pv.Cube())
    mmiss = pv.PolyData(np.stack(np.where(miss > 0.1)).T).glyph(geom=pv.Cube())
    pl.add_mesh(mhit, color='yellow')
    pl.add_mesh(mfp, color='green')
    pl.add_mesh(mmiss, color='red')
    pl.show()
hengck23
Posted a month ago
ยท 3rd in this Competition
i think this is a good pretrain/self-supervised or aux loss:
input vol(d,h,w)
predict mask(d,h,w)
aux target vol(d+delta,h,w)
predict how the volume will extrapolate
(this requires the model to "understand vessel" and grow them)
Posted a month ago
ยท 343rd in this Competition
good idea๐
hengck23
Posted a month ago
ยท 3rd in this Competition
by measuring the aux error of hidden kidney5 and kidney6, we can also know if these can similar to train data or not :)
if we have enough time, online fine tuning is possible
hengck23
Posted a month ago
ยท 3rd in this Competition
treat extrapolate as image prompting problem
Sequential Modeling Enables Scalable Learning forLarge Vision Models
https://yutongbai.com/lvm.html
hengck23
Posted a month ago
ยท 3rd in this Competition
if the holding cylinder is the same, this gives away the size and voxel resolution of the object
Posted a month ago
ยท 3rd in this Competition
Thank you very much for your sharing, based on your methods and information, here are the experimental data on my side. 
By the way, how did you generate the kidney's mask? I haven't used it yet.
hengck23
Posted a month ago
ยท 3rd in this Competition
" you generate the kidney's mask?"
i label a few and train a model to label the rest.
you should try the kaggle metric (surface dice) on your validation set as well.
iou is not good enough to capture cv/lb correlation
try lower threshold as well