• 0 Posts
  • 192 Comments
Joined 2 years ago
Cake day: January 17th, 2022



  • utopiah@lemmy.ml to Linux@lemmy.ml: Launcher for Everything*
    4 days ago

    Superficial feedback, but I can’t read more than 3 lines of code without syntax highlighting. Here I believe the lines are kept short for the text, but that makes the code even harder to read because of the line breaks. Maybe Codeberg allows for HTML embedding.

    Now for a comment on the content itself: how is that different from aliases in ~/.bashrc? I personally have a bunch of commands that are basically wrappers or shortcuts around existing ones with my default parameters.

    Finally, if the result is visual, like dmenu, which I only use a bit on the PinePhone, then please start by sharing a screenshot of the result.

    Anyway, thanks for sharing, always exciting to learn from others how they make THEIR systems theirs!


  • IMHO the question isn’t so much whether you, as a user of such platforms, are “f*cked”, because you sound both mindful and technically savvy. So, on that front, you will be OK.

    The harder question, I would say, is how morally bankrupt you will feel by contributing to worsening the privacy of others for profit. Namely, yes, by using Facebook/Insta/TikTok/etc. you will gain more customers, but those customers are gradually losing their privacy while you make those companies bigger by paying them. That means you depend on those companies more while they get more power.

    Because of that I would argue that sure, do everything you can to protect yourself, but it can’t stop there. I would argue that the question is rather where else you can find more clients, and maybe even “better” clients who are more aligned with your own views on privacy, and maybe even more. It’s definitely a challenge, especially seeing the trend of surveillance capitalism, but as you acknowledge yourself by using Lemmy, there are actual alternatives.



  • utopiah@lemmy.ml to Linux@lemmy.ml: Deduplication tool
    4 days ago

    FWIW I just did a quick test with rmlint and, as a user, I would definitely not trust an automated tool to remove files on my filesystem. If it’s for a proper data filesystem, basically a database, sure, but otherwise there is plenty of legitimate duplication, e.g. ./node_modules, so the risk of breaking things is relatively high. IMHO it’s better to learn why there are duplicates on a case-by-case basis, but again I don’t know your specific use case, so maybe it’d fit.

    PS: I imagine it’d be good for a content library, e.g. ebooks, ROMs, movies, etc.




  • utopiah@lemmy.ml to Linux@lemmy.ml: Deduplication tool
    4 days ago

    I don’t actually know, but I bet that’s relatively costly, so I would at least try to be mindful of efficiency, e.g.

    • use find to start only with large files, e.g. > 1 GB (depends on your own threshold)
    • look for a “cheap” way to find duplicates, e.g. exact same size (far from perfect, yet I bet it is sufficient in most cases)

    then after trying a couple of times

    • find a “better” way to detect duplicates, e.g. a SHA1 checksum (quite expensive)
    • lower the threshold to include more files, e.g. > 0.1 GB

    and possibly heuristics, e.g.

    • directories where all filenames are identical, maybe based on locate/updatedb, which is most likely already indexing your entire filesystem

    Why do I suggest all this rather than a tool? Because I bet a lot of decisions have to be made manually.
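For illustration, here is a minimal Python sketch of the staged approach described above: filter by size first, group by exact size as the cheap check, and only then compute SHA1 for files whose sizes collide. The function names (`find_duplicate_candidates`, `sha1_of`) and the default threshold are my own assumptions, not an existing tool. It only reports candidate groups and deletes nothing, so the decisions stay manual.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA1 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_candidates(root, min_size=1_000_000_000):
    """Return sorted groups of paths under root with identical size and SHA1.

    min_size is a hypothetical threshold in bytes (default roughly 1 GB);
    only files strictly larger than it are considered.
    """
    # Stage 1 (cheap): group large regular files by exact size.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file
            if size > min_size:
                by_size[size].append(path)

    # Stage 2 (expensive): hash only the files whose size collides.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_digest = defaultdict(list)
        for path in paths:
            try:
                by_digest[sha1_of(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicate_candidates("/some/path")` first and then again with a lower `min_size`, such as `100_000_000`, mirrors the "lower the threshold" step. The directory-name heuristic is not covered here.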












  • My documented process: https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence but honestly I just tinker with this. Most of that isn’t useful IMHO except some pieces, e.g. STT/TTS, from time to time. The LLM aspect itself is too unreliable, and I do like 2 relatively recent papers on the topic, namely:

    which respectively say that the long tail makes it practically impossible to train AI to be correct in rare cases, and that “hallucinations” is a misnomer used for marketing purposes, to be replaced instead by “bullshit”: output used to convince people without caring for veracity.

    Still, despite all this criticism, it is a very popular topic, hyped up to be the “future” of computing. Consequently I wanted both to try it and to help others do so, rather than imagine that it was restricted to a kind of “elite”. I try to keep the page up to date, but so far, to be honest, I do it mostly defensively: to be able to genuinely criticize because I did take the time to try, not reject it wholesale.

    PS: I also try the state of the art, both closed and open-source, via APIs, e.g. OpenAI or Mistral, but only for evaluation purposes, not as tools that are part of my daily usage.