Magic File website - magicfile.ir

Download source code for detecting the language of a text, written in vb.net

Short description and download link
We have prepared source code for recognizing the language of a text, written in vb.net, for you, dear users of the Magic File website.


Short link: https://en.magicfile.ir/?p=2453
Full description of the file


We have prepared source code for recognizing the language of a text, written in vb.net, for you, dear users of the Magic File website. The language recognition approach is based on comparing n-gram and word occurrences. It is suitable for any language that uses words (which is actually not true of all languages). Depending on the model and the length of the input text, the accuracy ranges from 70% (short Norwegian, Swedish and Danish texts classified with the "all" model) to 99.8% with the "default" model.

Background

Recognizing the language of a written text is probably one of the most fundamental tasks in natural language processing (NLP). For any language-dependent processing of an unknown text, the first thing you need to know is what language the text is written in. Fortunately, this is one of the easier NLP challenges. The approach I have chosen to implement is widely known and very simple. The idea is that each language has a characteristic pattern of character occurrences and co-occurrences.


The first step is to collect those statistics for all the languages that should be recognized. This is not as easy as it may seem at first. The problem is collecting a large set of test data (plain text) that contains only one language and is not domain specific. Newspaper articles alone may lack the word "I" and direct speech. Shakespeare's plays would not be the best basis for recognizing contemporary texts. Medical articles usually contain many domain-specific terms that are not even language-specific (major, minor, arterial, etc.). And if that's not hard enough, the texts should not be copyrighted.

I chose to use Wikipedia as my main source. I had to do some filtering. Wikipedia contains many proper names (such as names of groups) that often contain a "the" or an "and". That is why those words show up in the statistics of many languages even though they are not part of the language. This is not necessarily a disadvantage, as anglicisms have spread widely across many languages. I created three statistics for each language:

  • Character set
    • Some languages have a very specific character set (such as Chinese, Japanese, and Russian). For others, some characters give a good hint of the target language (e.g., German umlauts).
  • N-grams
    • After tokenizing the text into words (where necessary), the occurrences of 1-, 2- and 3-grams were counted. Some n-grams are very language specific (for example, "TH" in English).
  • Word list
    • A final source of disambiguation is the words that are actually used. Some languages (such as Portuguese and Spanish) are almost identical in the characters used as well as in the occurrence of certain n-grams. However, different words are used at different frequencies.
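
The three statistics can be sketched roughly as follows. This is a minimal illustration in Python, not the VB.NET code from the download. The function names (`tokenize`, `ranked`, `build_statistics`), the `top` cut-off, and the choice to count n-grams inside single words only are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def ranked(counter, top):
    """Return the keys ordered by descending count; ties are broken alphabetically."""
    return sorted(counter, key=lambda k: (-counter[k], k))[:top]


def build_statistics(text, top=300):
    """Build the three ranked tables (characters, n-grams, words) for one text.

    Hypothetical helper: the table size `top` is an arbitrary choice.
    """
    words = tokenize(text)

    # Character set: how often each letter occurs.
    chars = Counter(ch for word in words for ch in word)

    # N-grams: 1-, 2- and 3-grams counted inside each word.
    ngrams = Counter()
    for word in words:
        for n in (1, 2, 3):
            for i in range(len(word) - n + 1):
                ngrams[word[i:i + n]] += 1

    # Word list: how often each word occurs.
    word_counts = Counter(words)

    return {
        "chars": ranked(chars, top),
        "ngrams": ranked(ngrams, top),
        "words": ranked(word_counts, top),
    }


stats = build_statistics("the theme")
print(stats["chars"])   # ['e', 'h', 't', 'm']
print(stats["words"])   # ['the', 'theme']
```

Running `build_statistics` over a large single-language corpus would give one set of tables per language. Languages that are not written with spaces between words would need a different tokenizer than the one assumed here.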

A set of statistics is called a model. I have created subsets of the "all" model that best meet my needs (see table below). The "common" model includes the 10 most spoken languages in the world. "Small" and "Default" are based on my usage scenarios. If you are from another part of the world, your preferences may be different. So please don't take offense at my choice of which languages are in which model.

All statistics are sorted and ranked according to their occurrence. In the demo program, all models can be studied in detail. Classifying an unknown text is simple. The text is tokenized and the three statistics tables are generated for it. The resulting tables are compared with the tables of every model and a distance is calculated. The language whose model has the smallest distance to the unknown text is most likely the language of the text.
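
The comparison step could look roughly like the sketch below. The text does not say which distance is used, so this example assumes the common "out-of-place" rank distance and compares only the n-gram table, not all three. It is written in Python for illustration rather than VB.NET, and the names `ranked_ngrams`, `out_of_place` and `classify` are hypothetical.

```python
from collections import Counter


def ranked_ngrams(text, max_n=3, top=300):
    """Return the 1..max_n-grams of a text, most frequent first."""
    counts = Counter()
    for word in text.lower().split():
        for n in range(1, max_n + 1):
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(counts, key=lambda g: (-counts[g], g))[:top]


def out_of_place(sample, model):
    """Sum of rank differences between two ranked lists.

    An n-gram missing from the model costs a fixed penalty equal to the
    model length (an assumption of this sketch).
    """
    position = {gram: rank for rank, gram in enumerate(model)}
    penalty = len(model)
    return sum(
        abs(rank - position[gram]) if gram in position else penalty
        for rank, gram in enumerate(sample)
    )


def classify(text, models):
    """Return the language whose model is closest to the text."""
    sample = ranked_ngrams(text)
    return min(models, key=lambda lang: out_of_place(sample, models[lang]))


# Toy models built from a single sentence each; real models need far more text.
models = {
    "en": ranked_ngrams("the quick brown fox jumps over the lazy dog"),
    "de": ranked_ngrams("der schnelle braune fuchs springt über den faulen hund"),
}
print(classify("the quick brown fox jumps over the lazy dog", models))
```

A fuller version would compute one distance per table (characters, n-grams, words) and combine them. How those three distances should be weighted is not stated in the text.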

Code Language Quality Default Common Large Small
nl Dutch 13 x x
en English 13 x x x x
ca Catalan 13
fr French 13 x x x x
es Spanish 13 x x x x
no Norwegian 13 x x
da Danish 13 x x
it Italian 13 x x
sv Swedish 13 x x
de German 13 x x x x
pt Portuguese 13 x x x
ro Romanian 13
vi Vietnamese 13
tr Turkish 13 x
fi Finnish 12 x
hu Hungarian 12 x
cs Czech 12 x
pl Polish 12 x
el Greek 12 x
fa Persian 12
he Hebrew 12
sr Serbian 12
sl Slovenian 12
ar Arabic 12 x
nn Norwegian, Nynorsk (Norway) 12
ru Russian 11 x x
et Estonian 11
ko Korean 10
hi Hindi 10 x
is Icelandic 10
th Thai 9
bn Bengali (Bangladesh) 9 x
ja Japanese 9 x
zh Chinese (Simplified) 8 x
se Sami (Northern) (Sweden) 5

